Citation
Acoustic/Temporal characteristics of the pain cries of normal neonates

Material Information

Title:
Acoustic/Temporal characteristics of the pain cries of normal neonates
Creator:
Klepper, Brian Ross, 1952-
Publication Date:
Language:
English
Physical Description:
xi, 157 leaves : ill. ; 28 cm.

Subjects

Subjects / Keywords:
Newborn infants ( lcsh )
Neonatology ( lcsh )
Infants -- Crying ( lcsh )
Speech thesis Ph. D
Dissertations, Academic -- Speech -- UF
Genre:
bibliography ( marcgt )
non-fiction ( marcgt )

Notes

Thesis:
Thesis (Ph. D.)--University of Florida, 1985.
Bibliography:
Bibliography: leaves 148-156.
General Note:
Typescript.
General Note:
Vita.
Statement of Responsibility:
by Brian Ross Klepper.

Record Information

Source Institution:
University of Florida
Holding Location:
University of Florida
Rights Management:
The University of Florida George A. Smathers Libraries respect the intellectual property rights of others and do not claim any copyright interest in this item. This item may be protected by copyright but is made available here under a claim of fair use (17 U.S.C. §107) for non-profit research and educational purposes. Users of this work have responsibility for determining copyright status prior to reusing, publishing or reproducing this item for purposes other than what is allowed by fair use or other copyright exemptions. Any reuse of this item in excess of fair use or other copyright exemptions requires permission of the copyright holder. The Smathers Libraries would like to learn more about this item and invite individuals or organizations to contact the RDS coordinator (ufdissertations@uflib.ufl.edu) with any additional information they can provide.
Resource Identifier:
022937189 ( ALEPH )
14258152 ( OCLC )























ACOUSTIC/TEMPORAL CHARACTERISTICS OF
THE PAIN CRIES OF NORMAL NEONATES







By

BRIAN ROSS KLEPPER
























A DISSERTATION PRESENTED TO THE GRADUATE SCHOOL
OF THE UNIVERSITY OF FLORIDA IN
PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF DOCTOR OF PHILOSOPHY


UNIVERSITY OF FLORIDA 1985
















ACKNOWLEDGEMENTS



It has been my extreme good fortune to undergo the training for this degree in the company of talented professionals, good friends and family. This project, and indeed my entire academic career, has been patiently endured and urged on by many people.

My wife, Merlynn, has rowed with me through it all, has remembered the purposes, and believed in the worth of the crossing.

My parents have remained steadfastly supportive and ready to help at every moment. From them I have learned patience, compassion and the value of family.

Bob never failed to provide insight and analysis, seeing into the heart of structure and context. His friendship, great integrity and compassion have framed every situation of my adult life.

Dr. Susan Armstrong generously served as Brazelton evaluator and provided a developmental perspective that was invaluable. Her constant support and devotion to the ideas embodied by this study truly allowed it to come to fruition.

Dr. Gene Cranston Anderson's orientation to the dynamics of crying facilitated a leap in my own understanding of the implications of stress and infant vulnerability.

Drs. Ken Gerhardt, Howard Rothman, Sam Brown, Keith Berg and Richard Silverman have always been available for good counsel and support. Through their clear approaches to science and scholarship, each enriched my academic experience.











The people of IASCP--Kathy Farley, Norman Green, Stephanie Baldwin, Dr. Jim Hicks, Dr. Don Teas, Dr. Paul Moore and Dr. Kathy Berg--have been helpful and cooperative friends for my entire stay here.

Finally, Dr. Harry Hollien first precipitated my coming to graduate school and saw me through to final closure.

To each of these people, I give my heartfelt thanks.
































































TABLE OF CONTENTS

ACKNOWLEDGEMENTS

LIST OF TABLES

LIST OF FIGURES

ABSTRACT

CHAPTERS

I INTRODUCTION
    Perspectives
    The Physiological Dynamics of Crying
    Evaluation of the Cry Signal
        Perceptual Analyses of Cry
            Information conveyed by the cry
            Perceptions of health state from the cry
        Acoustic/Temporal Studies of Cry
            Fundamental frequency (F0)
            Jitter
            Formant frequencies
            Spectral distribution of energy level
            Intensity analyses of cry
            Temporal analyses of cry
                Latency
                Duration
    Summary of Previous Studies
    Specific Aims

II METHOD
    Overview
    Subject Selection
    Procedure
        Determination of Normality
            The parent interview
            Appropriate fetal growth
            Developmental status
        Collection of Cry Samples
            Elicitation of the pain cry
            Recording procedures
    Analysis Procedures
        Acoustic and temporal analyses
            Crying fundamental frequency
            Vocal jitter
            Formant frequencies
            Long term spectral analysis
            Intensity
            Timing
        Perceptual analyses
    Statistical Analysis Procedures

III RESULTS
    Acoustical Analyses
        Fundamental Frequency
        Jitter
        Formant Frequencies
        Long Term Spectral Distribution
        Intensity
    Temporal Analyses
        Latency
        Duration
    The Perceptual Analyses
        Author's judgements of "cry quality" characteristics
            Roughness
            Vocal fry
            Variations in frequency
            Register shifts
        Perceptions of roughness and strain
    Summary
        Fundamental Frequency
        Jitter
        Formant Frequencies
        Long Term Spectra Distribution
        Peak Intensity Level
        Timing
            Latency
            Duration
        Perceptual Judgements
            Author's judgements of cry features
            Panel's judgements of roughness and strain

IV DISCUSSION
    Basic Hypotheses
        Fundamental Frequency Production
        Jitter
        Formant Frequencies
        Long Term Spectra
        Intensity
        Latency
        Duration
    Other Issues
        Phonation
        Formant frequencies
        Peak intensity level and duration
    The Physiology of the Pain Cry
        Frequency
        Noise
    A Model of Cry Production
    The Use of the Pain Stimulus--Implications
    Cry and Pathology--A Perspective

CONCLUSIONS

APPENDICES

A COMPARISONS OF THE CRIES OF NORMAL AND PATHOLOGICALLY INVOLVED INFANTS
B THE PHYSIOLOGY OF CRY AND ACOUSTIC/TEMPORAL CORRELATES
C LETTER TO LOCAL PEDIATRICIANS
D LETTER TO PARENTS OF PROSPECTIVE SUBJECTS
E THE PONDERAL INDEX
F THE BRAZELTON NEONATAL BEHAVIORAL ASSESSMENT SCALE
    Curriculum Vitae--Susan Armstrong, Ph.D.
G INFORMED CONSENT FORM
H TAPE RECORDER CALIBRATION FOR INTENSITY MEASUREMENTS
I SOMATIC DATA FOR INDIVIDUAL INFANTS
J BNBAS (DEVELOPMENTAL) DATA FOR INDIVIDUAL INFANTS
K LONG TERM POWER SPECTRAL VALUES FROM INDIVIDUAL CRIES
L PERCEPTUAL JUDGEMENTS OF CRY

REFERENCES

BIOGRAPHICAL SKETCH









































LIST OF TABLES

TABLE

1-1 Studies of fundamental frequency of normal neonatal cry
1-2 Studies of formant frequencies of normal neonatal cry
1-3 Studies of intensity of normal neonatal cry
1-4 Studies of latency of normal neonatal cry
1-5 Studies of duration of normal neonatal cry
3-1 Fundamental frequency values for first phonation (P1) and second phonation (P2) cries
3-2 Corroboration of FFI fundamental frequency (F0) values
3-3 Summary table of frequency parameters
3-4 Results of paired difference t-tests between selected frequency parameters of phonation 1 and phonation 2
3-5 Formant frequency values for each cry sample
3-6 Results of paired difference t-tests between formant frequency values of phonation 1 and phonation 2
3-7 Averaged long term spectral data for all cry samples
3-8 Peak absolute intensity values (in dB SPL) for each cry phonation
3-9 Timing data for individual subjects
3-10 Perceptual responses of five trained listeners to each cry
4-1 Comparison of fundamental frequency values for normal pain cry in previous studies and in this study
4-2 Comparison of mean formant frequency values obtained in this study with those from previous studies
I-1 Somatic data for individual subjects
I-2 Brazelton scores for individual subjects
J-1 Long term spectral for each of the cry samples
J-2 The author's perceptual judgments of the cries relative to four parameters of "vocal quality"
J-3 Individual and mean perceptual responses to "strain" on each of the cry samples
J-4 Individual and mean perceptual responses to "roughness" on each of the cry samples



































































LIST OF FIGURES

Figure

1-1 Schematic of the physiological systems underlying human sound production
2-1 Schematic of the neonatal pain cry
3-1 Scatterplot of mean fundamental frequency values (in kHz) for each cry sample
3-2 Averaged long term spectral curve for all phonations
3-3 Comparison of the averaged long term spectral curves for the first and second phonation
3-4 Averaged long term spectral curve for first phonation
3-5 Averaged long term spectral curve for second phonation
















































Abstract of Dissertation Presented to the Graduate School of the University of Florida in Partial Fulfillment of the Requirements for the Degree of Doctor of Philosophy


ACOUSTIC/TEMPORAL CHARACTERISTICS OF THE PAIN CRIES OF NORMAL NEONATES

By

Brian Ross Klepper

May 1985


Chairman: Harry Hollien, Ph.D.
Major Department: Speech

The cry of the newborn infant is a complex signal which, through its components, reflects a host of underlying physiological processes. In order for the cry to be meaningfully studied, it first is necessary to describe it operationally. Earlier studies have yielded data for a broad range of parameters. However, few studies have been carried out which have applied even a high proportion of these analyses to a single data base. Consequently, acoustic and temporal (A/T) values were obtained for the cries of normal neonates in order to enhance present understanding and use of the cry.

Pain cries were recorded from 31 infants who were 26 to 30 days of age, had been full term, had uneventful medical histories, and who were shown to be within normal limits developmentally (using the Brazelton Neonatal Behavioral Assessment Scale) and somatically (using the Ponderal Index). Six A/T analyses were applied to determine values for fundamental frequency, jitter, formant frequency, long term spectra, intensity, latency, and duration. In addition, a panel of trained listeners rated each cry for levels of "strain" and "roughness."

While the preponderance of cries were low frequency (modal) in nature (around 400-530 Hz), a few normal cries were also high frequency (hyperphonational). High frequency phonation most often occurred immediately after stimulus application, and only for certain infants. This suggests that the generalized muscular contractions and heightened laryngeal tensions produced by the stimulus are reflected through cry frequency registration, and may provide an index of the infant's vulnerability to stress. Furthermore, the ability to recover to a "modal" cry may provide an index of the infant's ability to adapt to the stimulus and achieve homeostasis. In addition, latency values were found to be stable independent of the manner of cry response, indicating that this parameter may be a useful reflection of neurological constancy in normals. Finally, the extreme characteristics of first phonations relative to second (across most parameters) suggest that while early portions of an induced cry episode are stimulus-bound, reflecting characteristics of the response, later phonations evolve into a "basic cry mode," and probably are stimulus-independent.
































CHAPTER I
INTRODUCTION


Perspectives

The cry of the neonate is an expression to the outside world of dissatisfaction with an external or internal situation. Crying is a holistic, all encompassing state, which not only communicates a response or need, but also reflects underlying physiology. The baby vocalizes, tenses the muscles throughout his body (Bosma et al., 1966) and undergoes a range of systemic physiological changes (Anderson, 1984). Thought to be principally mediated by the periaqueductal region of the midbrain (Ploog, 1981), and under partial control of the autonomic nervous system (Lenneberg, 1967; Lieberman et al., 1971; Lester and Zeskind, 1982), the newborn cry is characterized by rhythmic sets of motorically complex reflexive acts that respond to environmental stimuli encountered at birth (Osgood, 1953).

In addition, the cry is an acoustic and temporal (A/T) event which, independently of intention or meaning, reflects specific physiological processes. The central nervous system (CNS) provides central control, the respiratory system powers the cry, the larynx operates to produce a source tone and the supra-glottal tract modulates the resulting signal into a final acoustic event (Fant, 1970).

Over the last half century, investigators have examined the perceptual and A/T characteristics of infant cry; research on the relationship between the cry characteristics of normal and pathologically involved babies has formed the nucleus of this thrust (See Appendix A). In addition, some work on the cry characteristics of different classes of normal infants has been carried out. Many significant contributions have been made by investigators in this area. However, as the discussion below will demonstrate, there have been problems at times with the fundamental assumptions and procedures utilized in these studies. Consequently, new protocols are needed so that the range of A/T cry characteristics may be described for normal neonates.

Through the years, researchers have obtained different types of vocalizations from infants over a wide range of ages. Therefore, it is appropriate to define the ages and vocalizations used in this discussion. As defined by the neonatological and the developmental psychological literatures, a neonate is an infant between birth and four weeks of age (Brazelton, 1981; Stratton, 1982). Infancy has been defined as a period extending between birth and 104 weeks (Crystal, 1973), but is applied here as a general term. Since the focus of this project was limited to the cry of the neonate, the terms "neonate," "newborn," "infant" or "baby" are used interchangeably.

In addition to crying, neonates exhibit a large number of vocal behaviors (whimpers, grunts, sneezes, sucking sounds, coughs, coos, etc.) which may properly be referred to as "vegetative" sounds (Hollien, 1980). While vegetative sounds may form a significant basis for later vocal behaviors (e.g., Stark et al., 1975; Nakazima, 1980) they are not as consistently patterned as are cries, nor as easily elicited or controlled. Consequently, neonatal vegetative sounds will not be considered in this project.



The Physiological Dynamics of Crying

Crying is a synergistic event, controlled by sets of neural commands which coordinate myriad programmed responses (Ploog, 1981). Like most sounds produced by humans, the vocal dimension of the neonate's cry results from the contributions of four physiological systems (See Figure 1-1). The nervous system provides central control for each of the other systems, synchronizing and coordinating their actions into a stylized form. The respiratory system is the power source for the sound production mechanism, creating pressure in the lungs which drives an airstream through the glottis. The larynx provides a mechanism for variable resistance to the air stream, resulting in the vibratory action of the vocal folds, and in turn, a source tone. Finally, the supra-glottal tract, comprised of several resonant chambers, acts to modify and "shape" the acoustic wave. A more detailed discussion of these four systems and assumptions as to their correlates in the cry may be found in Appendix B.

[Figure 1-1. Schematic of the physiological systems underlying human sound production: the central nervous system (central control), the respiratory system (power source), the laryngeal system (source tone production) and the supra-glottal tract (source tone modulation).]

Although the systems cited above are essential to a vocal cry response, each has a fundamental biological function in the maintenance of the organism. Consequently, the physiological responses associated with survival may interact with and significantly affect the production of the vocal signal. Obviously, the nervous system does not exist simply to control the cry, but facilitates information processing and transfer for all activities of the individual. The respiratory system promotes oxygenation of and carbon dioxide removal from the blood. The larynx is essentially a respiratory valve which prevents air from escaping the lungs and foreign substances from entering the airway. Finally, the supra-glottal tract establishes communication between the digestive and respiratory tracts and the exterior (Zemlin, 1981). In addition, there are a host of other systemic changes which accompany crying: bio-chemical (e.g., Woodson et al., 1981), cardiovascular (e.g., Vaughn and Sroufe, 1979; Anderson, 1978, 1984), respiratory (Karlberg, 1960; Long and Hull, 1961; Klaus et al., 1963; Vyas et al., 1981), state-transitional (Ruja, 1948; Thoman, 1975), and motoric (Bosma et al., 1966; Bosma, 1975). Thus, a spectrum of specific physiological activities is associated with crying, and the characteristics of each of these activities may interact with and affect the dynamics of cry vocalizations.

Anderson (1984) has argued that continuous or severe crying exerts potentially dangerous loads on the infant's physiological systems. For example, the generalized muscular tension which occurs during crying (Bosma et al., 1966) manifests locally in constriction of the laryngeal musculature (possibly for protection of the airway) and often results in periods of partial or complete glottal closure. The increase in airway resistance is complicated by increased contraction in the muscles of expiration. This juxtaposition of events constitutes a modified Valsalva maneuver. A radical increase in abdominal pressure occurs, raising the potential for abdominal hernia. Also, release from this strained posture may produce severe changes in blood pressure, which may result in recirculation of poorly oxygenated blood as well as fluctuations in cerebral blood flow velocity, all circumstances conducive to intraventricular hemorrhage.

It is reasonable to speculate as well that the dynamics associated with stress and the Valsalva maneuver (especially increased laryngeal muscle activity) result in specific vocal changes, including increases in frequency (from increased tension in the vocal folds), intensity level (from increased air flow during exhalation) and latency to the first phonation (from the interrupting influence of glottal closure). Several authors (Truby and Lind, 1966; Wolff, 1969; Lester and Zeskind, 1982) have noted that changes in muscle tension appear to correlate with changes in cry features through the course of the cry episode, and have speculated as to the different patterns of neurological innervation which underlie the motoric involvement of the different cry types. In their model of crying, Lester and Zeskind (1982) have hypothesized that a stressful stimulus initiates generalized and immediate inputs to the body by the sympathetic division (SNS) of the autonomic nervous system (ANS). In the larynx, this generalized input results in constriction of the intrinsic (and possibly the extrinsic) laryngeal musculature, with consequent lengthening and tensing of the vocal folds. Subsequently, the parasympathetic wing (PNS) of the ANS moderates the laryngeal tightening by providing discrete, inhibitory vagal input to the cricothyroid muscle, which is the principal muscle of laryngeal tension and frequency change. This model describes a servomechanism in which the immediate biological response is directly accountable to the initiating stimulus, but which over time achieves homeostasis and consequently is not tied to the stimulus. The vocal aspect of that response, the cry, follows the same pattern. That is, the onset of crying is likely to reflect characteristics of the stimulus (e.g., due to pain or hunger) while the basic or "modal" cry will be representative of the eventual return to "balance" within the system.



Evaluation of the Cry Signal

As described above, much research has been carried out on the physiological correlates of crying. However, many investigations have utilized the acoustic cry signal, and have focused on characteristics of frequency (pitch), wave composition (quality), intensity (loudness) or timing. Studies of cry have used two principal approaches: 1) perceptual analyses and 2) A/T analyses. The first approach (perceptual analyses) involves listening to live or recorded cries in order to make judgements about them. The second (A/T analyses) utilizes various instrumentational analyses to examine the frequency, intensity or timing characteristics of the recorded cry event.



Perceptual Analyses of Cry

Historically, many important descriptions of infant cry have been obtained from simple and careful observation. In an early experiment in bioacoustics, William Gardiner (1838) provided musical descriptions of various sounds in nature, including the cries of a child. Darwin (1872) discussed crying in his volume on the emotions of animals and man, using line drawings and photographs to describe the facial expressions and movements that accompany the cry.

The invention of sound recording enabled listeners to repeatedly examine transient events. In 1906, Flatau and Gutzmann used this process to record and study the cries of 30 infants. They estimated the pitch of the expiratory cry tones to be from G4 (392 Hz) to E7 (2637 Hz), a range of almost 3 octaves--and a reasonable approximation of range of fundamental frequency in infants as described in later analyses (See Table 1-1).
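The "almost 3 octaves" figure can be checked directly, since an octave is a doubling of frequency. The short Python sketch below is an illustration added for the reader, not part of the original analysis; the function name `octave_span` is hypothetical.

```python
import math


def octave_span(f_low_hz: float, f_high_hz: float) -> float:
    """Return the interval between two frequencies in octaves.

    One octave is a doubling of frequency, so the span is the
    base-2 logarithm of the frequency ratio.
    """
    return math.log2(f_high_hz / f_low_hz)


# Pitch limits reported by Flatau and Gutzmann: G4 (392 Hz) to E7 (2637 Hz).
span = octave_span(392.0, 2637.0)
print(f"{span:.2f} octaves")
```

The result is about 2.75 octaves (33 semitones), which agrees with the description of the range as almost 3 octaves.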

Information conveyed by the cry. A good deal of perceptual cry research has centered on the kinds of information conveyed by the newborn's cry. For example, a number of investigators have used listeners' judgements to examine the premise that the "quality" of a newborn cry is influenced by the cry stimulus (e.g., pain, hunger, startle). While Sherman (1927), Aldrich et al. (1945) and Muller et al. (1974) report little evidence to support this assumption, Wasz-Hockert et al. (1964), Wiesenfeld et al. (1981) and Sagi (1981) have provided data suggesting that individuals who have experience with infants (e.g., nurses, mothers) can distinguish between cries resulting from different stimuli better than less experienced subjects (fathers, non-mothers). In this regard, both Wolff (1969) and Bell and Ainsworth (1972) have argued that cries of differing acoustical qualities result in different responses from parents. However, as Hollien (1980) notes, the cry may simply be an alerting signal, and the caretaker may draw additional information about the stimulus from clues in the environment (time of day, infant history, etc.).

A number of experiments have been carried out to determine whether an individual infant's cries are so distinctive that the parent can differentiate them from the cries of other infants. The data here appear to be consistent--mothers (and fathers) are generally able to identify their own baby as early as three days after birth from the cry alone (Formby, 1967; Murry et al., 1975; Morsbach and Bunting, 1979; Wiesenfeld et al., 1981).

Perceptions of health state from the cry. Physicians traditionally have assumed that the cries of healthy babies can be perceptually distinguished from those of unhealthy infants. Moreover, pediatricians have long been trained to listen to the quality of the cry as a clue to diagnosis (Parmalee, 1962). On this subject, Illingworth (1955) wrote,

. . . a clinician recognizes the hoarse, gruff cry of a
cretin, the hoarse cry of laryngitis, the shrill cry of
hydrocephalus, meningitis, or cerebral irritability, the
grunting cry of pneumonia, the feeble cry of amyotonia or of a
severely debilitated infant. (p. 76)

Currently, several neonatal assessment examinations (e.g., Brazelton, 1976; Dubowitz and Dubowitz, 1981) use the evaluator's judgements of the latency and quality of the newborn's cry to determine the relative development and stability of the infant (Vaughn and McKay, 1975). Several authors have pointed out the danger of lending such a subjective measure too much weight in the diagnostic process (e.g., Truby and Lind, 1965; Ostwald, 1972). In a short essay on crying and diagnostics, Parmalee (1962) warned, "Despite the commonness of these observations in medical practice, there have been very few systematic studies of crying (p. 801)."

Vuorenkoski et al. (1971) tested physicians and medical students with recorded cries from normal and pathological infants, and reported that both groups did quite well in differentiating normal cries from abnormal. Further, more information about the acoustic (spectrographic) correlates of particular conditions facilitated a rise in correct identification rates. But while the authors pointed to the success of their training sessions, they also cautioned that the "relatively good results obtained in this study do not correlate in reality with the practical possibilities in clinical conditions." They noted that the test cries were culled as the best examples of specific conditions from several recordings of the same babies. In addition, subjects were allowed to listen to recordings repeatedly in order to make their judgements, while competing noises in a clinical situation would make the process considerably more difficult.

The ability of listeners to discriminate between cries is not limited only to normal/abnormal comparisons. It may extend as well to differences among infants who have been classified as normal. Zeskind and Lester (1978) utilized a sample of normal infants, but reclassified them as "low complication" or "high complication," based on scores of 1-2 or 5-9 complications respectively on Prechtl's (1977) scale of 42 optimal maternal, parturitional and fetal conditions. Low scores resulting from the absence of optimal conditions have been used as indicators of the degree of risk to the infant's nervous system (Prechtl, 1968). The experimenters presented the cries in a perceptual test to parents and non-parents. Subjects asked to rate the cries on an eight point polarized scale found the high-complication cries to be significantly more "negative" on all eight points than the low complication cries. Thus, this study demonstrated perceptible differences between "high- and low-risk normal" infants, and examined the relationship between the cry and an emotional response rendered from the listeners.









Several other workers also have explored the emotional reactions of listeners to cries of healthy or abnormal babies. Frodi et al. (1978) reported that parents found the cries of premature infants more aversive than those of term newborns. Interestingly, Freudenberg et al. (1978) found that normal newborns produced a cry which was more grating to adult listeners than the cries of Down syndrome infants. On this topic, Lester (1978) has speculated that abnormally sounding cries might profoundly affect the caregiver. He wrote:

. . . the cry of the infant at risk and the damaged or
diseased infant may be unique in order to elicit special
caregiving that may facilitate survival. . . . [Conversely,]
in a non-supportive environment, the behavior repertoire of the
poorly organized infant with a cry that is perceived as
grating and aversive may violate the limits of caregiver
control behavior and suppress the optimal caregiving pattern
necessary to facilitate the recovery of the baby. (p. 128)

Thus, characteristics of the cry carry information about underlying physiological processes, and the perceptual effects of those characteristics may have significant consequences in the relationship between the infant and his caregiver.

To summarize, the results of perceptual cry studies have demonstrated that listeners can perceive differences both between the cries of normal and abnormal infants, and between the cries of different classes of normal newborns. These data 1) lend credence to the postulate that different physiological states may be reflected in characteristic cry patterns, and 2) suggest that infants can be misclassified as "normal" and might not obtain that rating if more stringent classification schemes were applied. Indeed, as Lester (1978) notes, use of traditional classification schemes increases the probability that cries of infants from certain "normal classes" (or from one end of the "range of normality") might be perceptually confused with infants known to be abnormal. In addition, some evidence has been found to support the notion that different cry types elicit different caregiving behaviors. Such findings are not surprising, and suggest that humans have evolved an inherent notion of the nature of "appropriate" and "inappropriate" cry signals.

Finally, it should be noted that, almost by definition, the resolution available in perceptual cry studies is severely limited. Listeners can identify differences between groups of cries, but it is difficult to ascertain which qualities of the cry provide the cues for differentiation. Thus, unless a viable taxonomy is developed to specify cry configurations for a full range of normal and pathological neonatal conditions, perceptual analysis of the cry may only serve as a gross indicator of status, and one that is easily open to misinterpretation.



Acoustic/Temporal Studies of Cry

Instead of relying on a perceptual "Gestalt" of the cry, many investigators have employed A/T analysis techniques to characterize the cry and its components. Workers in this area have generally utilized a common conceptual framework to define the source phenomena to which the components of the cry may be related. Even so, a broad range of procedures and parameters have been employed in feature extraction and a substantial corpus of data obtained.

A few investigators have devised complex multi-parametric systems of cry analysis. In an early scheme, the Wasz-Hockert group utilized a host of cry parameters which were linked to spectrographic patterns. Lester and Zeskind (1982) refined this spectrographic analysis scheme,










eliminating some parameters and dividing the rest into features of duration, harmonics and quality. However, the difficulties involved in obtaining accurate acoustic/temporal cry data have still proved to be formidable, and until recently most investigators concentrated on only one or two parameters. Even the most ambitious (e.g., Colton and Steinschneider, 1980) have examined fewer than ten.

The application of A/T analysis procedures to acoustic signals assumes that any signal may be defined (i.e., quantified) in terms of its frequency, intensity and temporal characteristics. Moreover, because each aspect of the acoustic wave represents the contribution(s) of physical phenomena, the signal components may be assumed to be reflective of specific source (i.e., physiological) events. However, it also has been shown that the human sound production mechanism is comprised of interacting systems. Consequently, the components of the cry signal also reflect the interactive dynamics of the sound production process (e.g., Golub, 1980).



Over the last 40 years, substantial effort has been focused on the measurement of certain acoustic and temporal patterns of the neonatal cry. While much of this research centered on establishing differences between the cries of normal and abnormal infants, some studies also have been carried out to define the acoustic/temporal characteristics of normal infants. Some of the major findings of those studies are reviewed below. Only studies of infants four weeks of age or less are included.










Fundamental frequency (FO). The A/T cry parameter studied most thoroughly to date is fundamental frequency. Data for over 600 infants have been reported, and a mean FO value of slightly below 500 Hz may be distilled from the often disparate results of those studies. Table 1-1 presents the summarized data from investigations of FO in normal neonates.

A number of different tools have been used to extract fundamental frequency from the cry signal. Early studies of the FO of cry (Fairbanks, 1942; Michel, 1961) utilized phonophotography: each laryngeal cycle was measured using a photograph of an oscilloscopic trace of the cry signal. The results were quite accurate, although the data reduction procedure was lengthy by contemporary standards. An easier tool to work with and, to date, the principal analysis technique employed to obtain FO has been the time-frequency-amplitude (t-f-a) sound spectrograph. Although this device provides an adequate tool for estimating the timing of cries, it does not have adequate resolution to facilitate more than gross measurements of FO. In the narrow band mode (filter bandwidth = 45 Hz), measurements of the fundamental must be made by estimating the frequency of the nth harmonic overtone, and dividing by n. In the wide band mode (filter bandwidth = 300 Hz), low frequency glottal pulses may be resolved enough for identification and counting, but pulses occurring with high frequency phonation tend to blur together, impeding accurate calculation. Thus, using spectrographic analysis it is difficult to determine the fundamental frequency characteristics of a cry vocalization for more than a small portion of the signal at a time, since the representation of phonation is usually interspersed










Table 1-1 -- Studies of fundamental frequency of normal neonatal cry.

Study                              N    Class.    Age     FO     SD     Range      Stimulus
                                                  (days)  (Hz)   (Hz)   (Hz)

1. Flatau & Gutzmann, 1906         30             0-35                  392-2637   spont.

2. Fairbanks, 1942                 1    m         0-30    373           153-888    hunger

3. Michel, 1961                    1    m         10      289                      hunger
                                   1    f         13      352                      hunger

4. Ringel & Kluppel, 1964          10   4m/6f     1-2     413    30.05  290-508    pain

5. Sheppard & Lane, 1968           1    m         1-9     438                      *
                                                  13-21   411                      *
                                                  25-33   404                      *
                                   1    f         1-9     401                      *
                                                  13-21   384                      *
                                                  25-33   401                      *

6. Wasz-Hockert et al., 1968       77             0-39    500           450-550    birth
                                   60             0-30    530           410-650    pain
                                   75             0-30    470           390-550    hunger

7. Lieberman et al., 1971          1    f         0       400                      birth

8. Michelsson, 1971                50             0-10    620    110               pain

9. Tenold et al., 1974             9              2       518                      pain

10. Ostwald, 1974                  5                                    400-600    *

11. Prescott, 1975                 4              1-10    384    38                spont.

12. Lester, 1976                   12   flwt      2       308    32                pain
                                   12   undwt     2       479    26                pain

13. Sirvio & Michelsson, 1976      50                                   390-620    *

14. Kittel & Hecht, 1977           50             0       425                      birth










Table 1-1 -- continued.

Study                              N    Class.    Age     FO     SD     Range      Stimulus
                                                  (days)  (Hz)   (Hz)   (Hz)

15. Lester, 1978                   12   flwt      2       466    83                pain
                                   12   undwt     2       740    176               pain

16. Zeskind & Lester, 1978         12   lcomp     2       468    54                pain
                                   12   hcomp     2       814    263               pain

17. Lester & Zeskind, 1978         40             2       606    302               pain

18. Colton & Steinschneider, 1980  33   m         7       503    92                handling
                                   33   f         7       523    101               handling

19. Gardosik et al., 1980          53   m         0       467    79     327-669    birth
                                   50   f         0       458    65     356-664    birth

20. Zeskind, 1981                  13   ave-PI**  2       524    29     440-800    pain
                                   13   low-PI    2       1140   603    410-2000   pain
                                   13   hi-PI     2       991    576    410-2000   pain

21. Zeskind & Lester, 1981         19   ave-PI    2       488    100               pain
                                   19   low-PI    2       665    227               pain
                                   19   hi-PI     2       665    277               pain


Stimulus      N      X(FO)-Hz

Pain          236    536
Birth         231    467
Hunger        78     465
Spont.        70     506

All Cries     615    498


*Not presented
**Ponderal Index (see Appendix E)









between periods of noise or quiet. Calculations of the mean, standard deviation, mode or other parameters are extremely difficult to make. Indeed, Michelsson (1980) notes that the temporal instability of FO in the crying infant resulted in Wasz-Hockert et al.'s concentration on the specification of "maximum pitch," "minimum pitch" and "general pitch" (i.e., the "dominating pitch level in the cry"), rather than a mean value. Perhaps the most accurate determinations to date of neonatal FO have employed digital spectrum analyzers (Lester, 1976; Zeskind and Lester, 1978; Lester and Zeskind, 1978; Gardosik et al., 1980), which sample the cry signal at rates from 10,000-20,000 samples/second. Frequency by amplitude histograms are developed for each bandwidth in the analyzed portion of the spectrum and the first peak on the output histogram is "assumed to be the FO for that sample" (Gardosik et al., 1980). However, the accuracy of this method is compromised to some degree as well since 1) the resolution of the results is limited by the size of the bandwidths, and 2) histogram peaks do not represent the mean value of the sample, but rather the mode (i.e., the most frequently occurring value). While most analysis procedures have provided only an approximation of FO, precise calculation of cry FO has been obtained using the IASCP Fundamental Frequency Indicator (FFI) (Murry et al., 1975), although those data are not included in this report since the infants were 3-6 months of age. FFI is discussed in more detail in Chapter II.
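The two limitations just described can be illustrated with a minimal sketch. It is not a reconstruction of any analyzer used in the cited studies: the function names, the 50 Hz bin width and the sample FO values are all hypothetical. The first function applies the narrow band rule (frequency of the nth harmonic divided by n); the second bins a set of FO estimates and returns the center of the most populated bin, which is a mode whose precision is limited by the bin width.

```python
from collections import Counter


def f0_from_harmonic(harmonic_hz, n):
    """Narrow band estimate: frequency of the nth harmonic divided by n."""
    return harmonic_hz / n


def histogram_mode(values_hz, bin_width_hz):
    """Center of the most populated bin (ties resolve to the lowest bin)."""
    bins = Counter(int(v // bin_width_hz) for v in values_hz)
    top_bin = max(sorted(bins), key=lambda b: bins[b])
    return (top_bin + 0.5) * bin_width_hz


# Hypothetical per-cycle FO estimates (Hz) with two brief high-frequency shifts.
f0_values = [440, 445, 450, 455, 460, 700, 900]
mode_hz = histogram_mode(f0_values, bin_width_hz=50)
mean_hz = sum(f0_values) / len(f0_values)
print(mode_hz, mean_hz)
```

With these illustrative values the histogram peak falls well below the arithmetic mean, because the few high-frequency cycles raise the mean but do not move the most populated bin.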

Another difficulty with the FO cry values reported to date stems from the large variability in data among studies. For example, different investigators have elicited cries from different stimuli. (The breakdown of mean FO by stimulus is provided in Table 1-1.)












However, while some differences in FO are apparent (e.g., FO(pain) > FO(hunger)), the results overall are clouded by ambiguities as to the nature of the "spontaneous" and "birth" stimuli.
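The stimulus means at the foot of Table 1-1 appear to be consistent with averages of the study means weighted by sample size. The sketch below assumes that weighting (it is not stated in the text) and uses the "birth" and "hunger" rows of the table; the function name is hypothetical.

```python
def weighted_mean(rows):
    """N-weighted mean of study means; rows are (N, mean FO in Hz) pairs."""
    total_n = sum(n for n, _ in rows)
    return sum(n * mean for n, mean in rows) / total_n


# (N, mean FO) pairs taken from the "birth" and "hunger" rows of Table 1-1.
birth = [(77, 500), (1, 400), (50, 425), (53, 467), (50, 458)]
hunger = [(1, 373), (1, 289), (1, 352), (75, 470)]

print(round(weighted_mean(birth)), round(weighted_mean(hunger)))
```

Under this assumption the large-sample studies dominate each stimulus mean, so a single study with many subjects can shift the summary value considerably.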

In addition, many investigators have used very small subject samples in their studies (e.g., Fairbanks, 1942; Michel, 1961; Sheppard and Lane, 1968; Ostwald, 1974). Not surprisingly, the variability among FO values obtained for these small-sample studies is quite large. Michel (1961) reports an FO value of 289 Hz (49.7 semitones (ST)) for a single ten day old male child, while Sheppard and Lane (1968) report a value of 438 Hz (56.9 ST), 52% higher than Michel's infant, for another male child of nearly the same age.

Similar patterns of FO variability are evident for larger samples of infants. For example, Michelsson (1971) reports data for 50 babies at 620 Hz (62.94 ST). By contrast, Wasz-Hockert et al. (1968) report that the mean FO value for pain cries of 60 newborns is 530 Hz (60.22 ST) (about 15% lower than Michelsson's), and Gardosik et al. (1980) arrive at a mean value of 462.5 Hz (57.86 ST) for the pain cries of 103 infants (25.5% below Michelsson's data). Such wide variability in the FO values of groups of normal infants seems excessive.
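The semitone (ST) values quoted above can be reproduced with the usual logarithmic conversion. The reference frequency is not stated in the text; the sketch below assumes 16.35 Hz, a value inferred here because it matches the quoted Hz/ST pairs. The function names are hypothetical.

```python
import math

# Assumed reference frequency (Hz); inferred from the quoted values, not stated in the source.
REFERENCE_HZ = 16.35


def hz_to_semitones(freq_hz, reference_hz=REFERENCE_HZ):
    """Semitones above the reference: 12 * log2(f / f_ref)."""
    return 12.0 * math.log2(freq_hz / reference_hz)


def percent_higher(value, baseline):
    """How much higher value is than baseline, in percent of baseline."""
    return 100.0 * (value - baseline) / baseline


for hz in (289, 438, 530, 620):
    print(hz, round(hz_to_semitones(hz), 2))

# Sheppard and Lane's 438 Hz relative to Michel's 289 Hz.
print(round(percent_higher(438, 289)))
```

Because the semitone scale is logarithmic, equal ST differences correspond to equal frequency ratios rather than equal differences in Hz, which is convenient when comparing studies whose means differ widely.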

The variability of FO within groups of normal infants has been addressed in studies carried out by Lester (1976), Lester and Zeskind (1978), Zeskind and Lester (1978, 1981), and Zeskind (1981). These researchers have compared groups of clinically "normal" two day old infants who subsequently were reclassified as "low risk" or "at risk" using the Ponderal Index (PI) and the Brazelton. They hypothesize that since these tests provide information about conditions which are









indicative of CNS "stress," poor scores on the examinations may be correlated with abnormal cry characteristics (such as FO). Interestingly, significant differences between cry features for the "low risk" and "at risk" groups have been found consistently. The fundamental frequency of healthy infants (with higher PI scores and Brazelton performances) has been generally lower, while underweight infants (with low PI scores and Brazelton performances) have exhibited higher FOs. The data from the Lester and Zeskind studies lead to the conclusion that studies of "normal infants" which have not used such stringent subject selection criteria may have included both "low risk" and "at risk" babies in their samples. Consequently, the investigators in these studies may have obtained mean FO and standard deviation values that are skewed upward.

Finally, the classification by fundamental frequency production according to "cry modes" was introduced by the Wasz-Hockert group (cited in Michelsson et al., 1980) who noted that infants phonate in three distinct modes: "phonation," "hyperphonation" and "dysphonation." They hypothesized that while normal phonation is predominant, 1) tightening of the laryngeal musculature may result in short periods of high frequency production (hyperphonation) and 2) inadequate closure of the vocal folds may produce some turbulence in the signal (dysphonation) (see also Physiological Dynamics of the Cry, above). While the Wasz-Hockert group simply noted the absence or presence of this characteristic in their analysis, Golub (1980) furthered analysis of their parameter by using a "formant tracking" program and linear predictive correlation analysis. He reported that normal infants "normally" phonate during approximately 80% of the cry,









"hyperphonate" during approximately 5% of the cry and "dysphonate" during approximately 15% of the cry.

In actuality, these classification modes refer to two different kinds of phenomena: 1) vocal registers during crying and 2) the presence or absence of turbulence, noise or aperiodicity during "normal phonation" (since noise does not appear to occur during "hyperphonation"). In addition, cry registration has not only been identified with respect to the frequency of a continuous cry, but in the rapid transition during one phonation from phonation to hyperphonation and back again. This phenomenon is known in the phonetics literature as a "voice break," but is referred to in the cry literature simply as a register shift (e.g., Michelsson, 1980; Lester and Zeskind, 1982).

Hollien (1974) defines a vocal register "simply as a series or range of consecutively phonated frequencies which can be produced with nearly identical vocal quality [and with] little or no overlap in fundamental frequency (FO) between adjacent registers" (p. 126). He points out that while the idea of vocal registration is used loosely within several disciplines, the identification of a vocal register requires an operational definition 1) perceptually, 2) acoustically, 3) physiologically and 4) aerodynamically. It seems clear that while there is some evidence available (perceptual, acoustic) to substantiate the existence of vocal registers during crying, the physiological and aerodynamic data are insufficient at this juncture.

Still, the concept of registers provides a convenient vehicle for the discussion of similar qualities among cries, and consequently the terms appropriate to different registers (from Hollien, 1974) will be










applied throughout the following text. The type of cry which occurs most often will be referred to as "modal." High frequency crying, sometimes referred to in the literature as "flute" or "whistle" (Hollien, 1974), will be referred to as "hyperphonation," "falsetto," or "loft." Low frequency crying, called "glottal pulse" or "creak" in the literature, will be referred to as "pulse" or "fry."

Jitter. Vocal jitter is the cycle-to-cycle variation found within successive periods of a laryngeal vibratory pattern; it is a commonly accepted characteristic of human vocalization. Although jitter has not been employed in cry studies thus far, it has been mentioned in the cry literature, and is potentially attractive as a useful measure. Normally, it is assumed that the phonatory sample is approximately stable around a central frequency, since increases or decreases in frequency may tend to bias measurement. Moreover, the phonational samples generally are relatively free of aperiodicity, since the presence of noise, characterized by large (apparently random) frequency changes between adjacent cycles, may inflate jitter values.

Jitter has been found to occur in the sustained phonation of young adults. Values of 0.5-1.0% appear to be typical, although the magnitude appears to be dependent upon age, frequency and intensity level of the particular sample and the method of measurement (Horii, 1979; Wilcox, 1978; Hollien et al., 1973). It has been suggested also that the magnitude of jitter is indicative of the general condition of the larynx (Hecker and Kreul, 1971; Murry and Doherty, 1980). Lieberman (1963) proposed that jitter may have utility in describing the stability of infants' laryngeal control. Similarly,










Bosma, Truby and Lind (1965) suggested that an infant's neural developmental status may be evaluated from factors such as the stability of laryngeal coordinations and the mobility of vocal tract components during crying.

To date, no studies of neonatal crying have examined jitter. However, it is not unreasonable to assume that jitter levels might be significantly larger for neonates, since frequency production often is quite variable, and since a noise component (characterized by random cycle-to-cycle changes in frequency) often constitutes a substantial portion of cry phonation.
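Jitter expressed as a percentage can be sketched as follows. The text does not give a formula, so this assumes one common definition: the mean absolute difference between consecutive periods divided by the mean period. The function name and the period values are hypothetical.

```python
def jitter_percent(periods_ms):
    """Mean absolute cycle-to-cycle period difference as a percent of the mean period."""
    if len(periods_ms) < 2:
        raise ValueError("at least two periods are required")
    diffs = [abs(b - a) for a, b in zip(periods_ms, periods_ms[1:])]
    mean_diff = sum(diffs) / len(diffs)
    mean_period = sum(periods_ms) / len(periods_ms)
    return 100.0 * mean_diff / mean_period


# Hypothetical periods (ms) near 2 ms, i.e., phonation near 500 Hz.
steady = [2.00, 2.02, 2.00, 2.02, 2.00]
noisy = [2.00, 2.30, 1.80, 2.40, 1.90]
print(round(jitter_percent(steady), 3), round(jitter_percent(noisy), 1))
```

The second list shows how a few large, apparently random period changes of the kind attributed to noise inflate the value far beyond the 0.5-1.0% range cited for adults.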

Formant frequencies. The specification of formant frequencies during crying has been the least investigated of the cry parameters (see Table 1-2). Most investigators interested in this area have measured the center frequencies of cry formants on t-f-a spectrograms, but Gardosik et al. (1980) cross-checked their obtained values through the use of a spectrum analyzer, verifying that spectral energy peaks existed at the expected frequencies. Lieberman et al. (1971) reported that the formant values they obtained were similar to those predicted from the vocal tract transfer function. The major difference between the observed and predicted frequencies occurs at the first formant, where the center frequency is much higher than expected. In general, the findings of each of the four studies carried out in this area are in agreement with one another.

However, while it is a relatively simple task to identify spectral peaks, it is difficult to know how those frequencies relate to the production of sound during cry since 1) there is little good information about the anatomical characteristics of the infant










Table 1-2 -- Studies of formant frequencies of normal neonatal cry.

Study                              N    Class   F1            F2            F3
                                                (Hz)          (Hz)          (Hz)

1. Ringel & Kluppel, 1964          10   4m/6f   (1400-1500)@  (1700-3200)@  *

2. Lieberman et al., 1971          1    m       1100          3300

3. Colton & Steinschneider, 1980   33   m       1592.5        3223.8        5337.2
                                                (396.8)       (585.4)       (863.6)
                                   33   f       1653.1        3274.6        5368.2
                                                (322.5)       (487.0)       (785.8)

4. Gardosik et al., 1980           53   m       1573.2        3106.5        *
                                                (937-3281)    (1719-4375)   *
                                   50   f       1527.0        3111.2        *
                                                (900-2400)    (1875-4375)   *


*Not provided
@approximately










vocal tract and 2) models relating formant frequencies to adult anatomy may not apply to neonates. Lieberman et al. (1971) used data about the neonatal tongue (Hopkin, 1967) and the positioning of the neonatal larynx (Noback, 1923) relative to adult measurements (Chiba and Kajiyama, 1958), and estimated the newborn's vocal tract size to be 7.5 cm. Colton and Steinschneider (1981) provide no reference for their estimate of 8 cm. Furthermore, as Golub (1980) points out, there are obvious anatomical differences between adults and neonates (e.g., infants have tongues that nearly fill the oral cavity, and large fat pads in their cheeks) which result in different acoustically relevant dimensions such as 1) the ratio of pharynx length to mouth length or 2) the ratio of nasal tract length to vocal tract length. Cineradiographic data from Bosma, Truby and Lind (1965) indicate that during the crying episode, the supralaryngeal vocal tract is nearly stationary, with the tongue thrust down and forward in the open mouth.

Traditionally, the acoustical resonances for the human vocal tract have been modeled as a simple tube (uniform cross sectional area) which is closed at one end (Fant, 1960; Stevens and House, 1961). Thus, the resonant frequencies of the tube are given by the equation

     F(k) = (2k+1)C / 4L,     k = 0, 1, 2, ...

where C = the velocity of sound (approximately 340 meters/second, i.e., 34,000 cm/second), L = the length of the tube (in cm) and (2k+1) = the odd multiplier for the formant in question (k = 0 for F1, k = 1 for F2, k = 2 for F3). Thus, for a vocal tract length of 7.75 cm (the approximate value used by Lieberman et al. (1971) and by Colton et al. (1981)), F1-F3 values of approximately 1100, 3300 and 5500 Hz would be predicted.

Data approximating the cited values were reported by Lieberman et al. (1971). Specifically, the cries they examined appeared consistent with this relationship; hence they argued that the rigidity of the infant supralaryngeal tract apparently was responsible for the constancy of relationships among the formant values. However, as noted above, while the F2 values reported across all studies have been consistent, the data for F1 generally have been higher than those found by Lieberman et al. Based on the F1 value generated in most studies (approximately 1525 Hz), vocal tract length would be estimated to be 5.6 cm, or approximately one-third the vocal tract length of an adult male (Zemlin, 1981). Colton and Steinschneider obtained F2 (but not F1) values that agree with those of Lieberman et al. (approximately 3200 Hz); they appear to believe that these values do reflect the vocal tract resonances based on the uniform tube equation. In turn, they argue that a higher than expected degree of mouth opening is reflected in an elevated F1 value with minimal effect on the other formants (Fant, 1960).
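The uniform tube relationship and its inverse can be checked numerically. The sketch below assumes a sound velocity of 34,000 cm/second (340 meters/second expressed in centimeters, so that tube length may be entered in cm); the function names are hypothetical.

```python
# Assumed velocity of sound, expressed in cm/second to match tube length in cm.
SPEED_OF_SOUND_CM_PER_S = 34000.0


def quarter_wave_formants(length_cm, count=3, c=SPEED_OF_SOUND_CM_PER_S):
    """Resonances (Hz) of a uniform tube closed at one end: (2k+1)c/4L, k = 0, 1, ..."""
    return [(2 * k + 1) * c / (4.0 * length_cm) for k in range(count)]


def tract_length_from_f1(f1_hz, c=SPEED_OF_SOUND_CM_PER_S):
    """Tube length (cm) implied by a first resonance, from L = c / (4 * F1)."""
    return c / (4.0 * f1_hz)


print([round(f) for f in quarter_wave_formants(7.75)])
print(round(tract_length_from_f1(1525.0), 1))
```

For a 7.75 cm tube the three resonances come out near 1100, 3300 and 5500 Hz, and a first formant of 1525 Hz implies a tube of about 5.6 cm, in agreement with the figures cited in the text.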

Spectral distribution of energy level. Colton and Steinschneider (1980) used fast Fourier transform (FFT) techniques to specify the relative energy levels within three spectral bandwidths. At a microphone distance of 9 inches, they found 73 dB between 0.05-4.0 kHz, 64 dB between 4.0-8.0 kHz and 46 dB between 8.0 and 16.0 kHz. Given the relative ease with which these data can be generated, it is surprising that more research has not been carried out with this parameter. Spectral analyses of this type (but with higher










resolution) might be potentially useful to characterize the appropriate energy distributions for normal, healthy infants under a variety of conditions. For example, comparisons of spectral curves from different cry-types might indicate different energy levels in particular frequency intervals, which in turn might suggest the presence or absence of noise within the signal.
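A band-level comparison of this kind can be sketched with a discrete Fourier transform. The example below is illustrative only and is not the procedure used by Colton and Steinschneider: it uses a slow, direct DFT from the standard library instead of an FFT, a synthetic two-tone signal, an assumed 32 kHz sampling rate, and uncalibrated relative levels (not dB SPL). The band edges follow the three bandwidths named in the text; all names are hypothetical.

```python
import cmath
import math


def band_levels_db(samples, sample_rate, bands):
    """Relative level (dB, uncalibrated) of the summed spectral power in each band."""
    n = len(samples)
    # One-sided power spectrum from a direct DFT (adequate for short signals).
    power = []
    for k in range(n // 2 + 1):
        acc = sum(samples[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        power.append(abs(acc) ** 2)
    levels = []
    for low_hz, high_hz in bands:
        total = sum(p for k, p in enumerate(power) if low_hz <= k * sample_rate / n < high_hz)
        levels.append(10.0 * math.log10(total) if total > 0 else float("-inf"))
    return levels


# Synthetic signal: a 1 kHz tone plus a 6 kHz tone at one-tenth the amplitude.
SAMPLE_RATE = 32000
N = 320
signal = [
    math.sin(2 * math.pi * 1000 * t / SAMPLE_RATE)
    + 0.1 * math.sin(2 * math.pi * 6000 * t / SAMPLE_RATE)
    for t in range(N)
]
BANDS = [(50, 4000), (4000, 8000), (8000, 16000)]
levels = band_levels_db(signal, SAMPLE_RATE, BANDS)
print(round(levels[0] - levels[1], 2))
```

A tenfold amplitude difference between the two tones appears as a 20 dB difference between the first two bands; a noise component in the signal would instead raise the level across a whole band.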

Intensity analyses of cry. The study of intensity constitutes one of the most challenging problems in bioacoustics, since in many instances, recording procedures are not designed with this parameter in mind. The studies summarized in Table 1-3 represent the few descriptions of the intensity characteristics of the normal newborn cry. A number of investigators have reported their impressions of the relative amplitudes of cry sounds. Ostwald claimed that normal cry sounds are generally 20 dB more intense than other vocal sounds (Ostwald, 1963). Stark and Nathanson (1975) noted that the cries of an infant who had died of Sudden Infant Death Syndrome (SIDS) were about 10 dB lower in intensity than the cries of normal infants. Truby and Lind (1965) noted that the cries of CNS damaged infants are characterized by a greater variability in their output levels. Each of these reports seems reasonable, since it might be expected that healthy infants cry with greater control, consistency and expiratory power than compromised infants.

Clearly, there has been a lack of quantitative data about the intensity of the normal cry. Most studies of cry have been carried out in clinical situations where environmental noise is difficult to control. Few studies have actually attempted to calibrate the recording systems other than by trying to maintain the microphone a









Table 1-3 -- Studies of intensity of normal neonatal cry.

Study                              N    Age     Class.   X (dB)   SD     Mic.Dist.
                                        (days)                           (inches)

1. Ringel & Kluppel, 1964          10   1-2     4m/6f    82.13    3.40   12

2. Sheppard & Lane, 1968           1    1-9     m        0.32@           *
                                        13-21            0.29@           *
                                        25-33            0.31@           *
                                   1    1-9     f        0.30@           *
                                        13-21            0.25@           *
                                        25-33            0.34@           *

3. Zeskind & Lester, 1976          12   2       lcomp.   42.75    9.24   16
                                   12   2       hcomp.   46.38    8.90   *

4. Colton & Steinschneider, 1980   33   7       m        74.34    6.79   9
                                   33   7       f        73.72    7.36   9


*Not provided
@Values represent "coefficient of variability"










certain distance from the crying infant. Ringel and Kluppel (1964) suspended a microphone one foot above the child's head, and Colton and Steinschneider (1980) affixed a microphone to the side of the baby's crib, but in neither study were the movements of the child or the general noise level controlled other than by being in a "quiet room." Sheppard and Lane (1968) controlled for noise by recording within "plexiglass air cribs," but make no mention of microphone distance. In addition, they attempted to derive a meaningful measure of relative cry intensity by calculating an "average coefficient of variation within utterances in the amplitude measures" (p.

Temporal analyses of cry. The cry event may be thought of as comprising several temporal markers--the stimulus application, the first expiratory burst, onset of first phonation, offset of first phonation, onset of second phonation--any of which may be used to designate the elapsed time for sub-events. Two of these sub-events, latency to the first phonation and duration of the first phonation, have been discussed repeatedly in the literature.

Latency. Latency is the elapsed time from the application of a cry stimulus (e.g., pain, startle) to the onset of the first expiratory phonation. It may be thought of as consisting of three events: 1) neurological processing time, 2) inhalation time and 3) the time required for expiratory adjustments appropriate to onset. Latency has traditionally included inhalation since it is often difficult to ascertain from the recorded audio signal the precise onset and duration of that event. Neurological processing time refers to the amount of time the organism needs to respond to the stimulus--it is assumed that the stressed or compromised individual










will respond less effectively (and consequently in a different time frame) than the healthy individual. Michelsson's (1971) data for 158 normal children over the span of their first year suggest that the latency remains stable during that time. However, she also makes the excellent point that the cry latency may vary slightly due to the alertness and state of the infant. The summarized latency data from eight cry studies are presented in Table 1-4. As may be seen, most of the reported mean latency values are consistent with one another, with the possible exception of the three "high" values from Michelsson (1971) (1.8 seconds) and from the Lester and Zeskind (1976, 1978) underweight and high complication populations. Again, the data from Lester and Zeskind are revealing. Their low complication and full weight infants exhibit latency values which agree with values reported by many other investigators. However, normal "at risk" infants in those studies exhibit values which are greater than those cited by others, a relationship which probably reflects their stressed CNS status.

The relative agreement among the temporal values obtained in different studies is probably partially due to the fact that latency measurements can be made from calibrated time lines which tend to be quite accurate. Fisichelli and Karelitz (1963) used a graphic milliammeter operating at five seconds per inch to make their measurements. Other investigators have used spectrograms to make temporal measurements; timing is the most accurate dimension of the t-f-a display. One source of potential variability in temporal measurements comes from the fact that most studies have utilized a verbal signal (e.g., "Now") to identify the application of the cry










Table 1-4 -- Studies of latency of normal neonatal cry.

Study                              N    Age     Class    X       SD      Range
                                                         (sec)   (sec)

1. Fisichelli & Karelitz, 1963     17   1-30             1.54    0.66    0-2.0

2. Michelsson, 1971                50   1-10             1.80            1.2-2.5

3. Caldwell & Leeper, 1974         26   2       m        1.46    0.27    *
                                        3                1.41    0.30    *
                                        4                1.67    0.94    *
                                        2       f        1.45    0.40    *
                                        3                1.50    0.51    *
                                        4                1.55    0.54    *

4. Lester, 1976                    12   2       flwt.    1.47    0.26    *
                                   12   2       unwt.    1.80    0.69    *

5. Lester, 1978                    12   2       flwt.    1.56    0.46    *
                                   12   2       unwt.    1.72    0.29    *

6. Zeskind & Lester, 1978          24   2       lcomp.   1.37    0.62    *
                                   24   2       hcomp.   2.08    1.10    *

7. Thoden & Koivisto, 1980         38   1                1.40    0.80    0.4-3.6
                                        5                1.40    0.70    0.4-3.2

8. Zeskind & Lester, 1981          19   2       ave-PI   1.20    0.30    *
                                   19   2       low-PI   1.76    0.66    *
                                   19   2       hi-PI    1.66    0.90    *


*Not provided









stimulus. If latency is to serve as an indicator of sensorineural and motoric processing time, differences of milliseconds might be significant. Thus, the resolution of the latency metric may be degraded by the variability inherent in using a vocal correlate of the stimulus.

Duration. Duration, or cry-time, refers to the expirational phonation time, often only the first cry, but sometimes until the end of a specified phonation in the nth cry cycle. Duration represents a temporal measure, and thus is reasonably easy to determine accurately--at least prior to the lower amplitude portions of the decaying signal. However, as may be seen in the data presented in Table 1-5, there is variability in the obtained values: 1.2 seconds for the first cry of 33 newborns (Colton and Steinschneider, 1980) to 6.45 seconds for the first cry of 24 low complication newborns (Zeskind and Lester, 1978). The larger of these sample values is nearly 540% of the smaller! Some of these discrepancies are undoubtedly due to the lack of agreement among researchers as to the boundaries of the event to be measured. For example, Thoden and Koivisto (1980) measured the first phonation between the first two inspirations. However, Michelsson (1980) indicates that the continuity of phonation is often broken, and notes that she would only accept for measurement cries of more than 0.4 seconds. Thus, the values from previous studies are often not only highly variable, but in disagreement. For example, Lester (1976) presents data indicating that low complication infants have shorter cries than high complication babies, while the same author (1978) also has provided data suggesting that full weight newborns cry longer than underweight ones. The data from the former










Table 1-5 -- Studies of duration of normal neonatal cry.


Study N Age Class X Dur. SD Ind. Ind. Range
(days) (sec.) Dur. SD

1. Ringel & 10 1-2 4m/6f 1.5 0.62 0.6-4.0
Kluppel,1964

2. Sheppard & 1 1-9 m 0.6 *
Lane,1968 13-21 0.5 *
25-33 0.5 *
1 1-9 f 0.1 *
13-21 0.4 *
25-33 0.6 *

3. Wolff, 1969 1 3 4.1 *

4. Michelsson,1971 50 1-10 2.0 1.6-3.6

5. Caldwell & 26 1 m 0.61 0.14 *
Leeper,1974 2 0.60 0.20 *
3 0.68 0.22 *
1 f 0.67 0.34 *
2 0.68 0.19 *
3 0.62 0.24 *

6. Prescott,1975 4 1-10 1.2 0.55 *

7. Lester,1976 12 2 flwt. 1.5 0.39 *
12 2 unwt. 2.7 1.15 *

8. Lester,1978 12 2 flwt. 1.3 0.54 *
12 2 unwt. 4.8 0.38 *

9. Lester & 40 2 2.5 2.23 *
Zeskind,1978

10.Zeskind & 24 2 lcomp. 6.5 3.62 *
Lester,1978 24 2 hcomp. 3.8 1.79 *

11.Colton & Stein- 33 7 m 1.2 0.39 *
schneider,1980 33 7 f 1.2 0.31 *

12.Thoden & 38 1 5.0 2.80 *
Koivisto,1980 38 5 5.2 2.30 *


*Not provided










study are particularly difficult to reconcile in light of all other information presented by those authors. Michelsson (1980) warns that since the intensity of the phonational signal trails off slowly, it may be necessary to boost the output gain in order to identify the end of the cry. Moreover, if input level controls are calibrated to avoid clipping the most intense portion of the cry signal, the least intense portions of the signal (at the end of phonation, for example) may be lost. Even if the low amplitude parts of the signal are captured on the recording, the analysis procedure may not be adequate to facilitate detection of the quiet end of the phonational signal. In any case, some of the values reported for the duration of the first expirational phonation are too extreme to be accepted without question. In all likelihood, differences in definitions of the event, measurement procedures, and instrumentational difficulties may account for the discrepancies.
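The two temporal measures discussed above can be stated operationally in terms of the markers of the cry event. The sketch below is a minimal illustration under stated assumptions: the class and field names are hypothetical, marker times are in seconds from the start of the recording, and the 0.4 second acceptance threshold is the one attributed to Michelsson (1980).

```python
from dataclasses import dataclass


@dataclass
class CryMarkers:
    """Marker times (seconds) for one cry event; names are illustrative."""
    stimulus_s: float
    phonation_onset_s: float
    phonation_offset_s: float

    def latency(self):
        """Elapsed time from stimulus application to onset of first phonation."""
        return self.phonation_onset_s - self.stimulus_s

    def duration(self):
        """Elapsed time from onset to offset of the first phonation."""
        return self.phonation_offset_s - self.phonation_onset_s


def is_measurable(markers, min_duration_s=0.4):
    """Accept only cries longer than the minimum duration."""
    return markers.duration() > min_duration_s


cry = CryMarkers(stimulus_s=0.0, phonation_onset_s=1.5, phonation_offset_s=3.5)
print(cry.latency(), cry.duration(), is_measurable(cry))
```

Writing the definitions this way makes the sources of disagreement explicit: a different choice of offset marker, or a different acceptance threshold, changes the reported duration without any change in the cry itself.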



Summary of Previous Studies

In general, the cited studies have reported a broad range of results for specific parameters. The sources of variation may be attributed to the following issues.

1) Researchers have often disagreed as to a) which acoustic or temporal cry features are important to study, b) what the definitions of a specific parameter should be and c) the best measurement procedure to apply to the task. Consequently, although a good deal of normative data has been generated for certain parameters, integration of data from different studies frequently has been difficult. Thus, an understanding of the cry event must be predicated on appropriate and accepted physiological and acoustic constructs.

2) Until recently, most investigations of cry have tended to define normality only in terms of its difference from abnormality, and relatively few investigators have sought to establish acoustic/temporal baseline data for normal infants. Further, recent indications are that most studies which have been focused on normal babies have not applied stringent enough criteria for normality to subject selection. Consequently, the data presently available for most neonatal cry parameters should be reconsidered in light of such realizations. Moreover, while some differences have been identified between the cries of normal and abnormal newborns, differences are also clear between the cries of different groups of normal infants. The introduction of developmental and somatic criteria into this type of research (e.g., Zeskind and Lester, 1978) provided additional dimensions which could be used to establish bounds to which specific data could be applied. Therefore, it was appropriate to attempt to establish a range of acoustical and temporal cry values for infants who have been clearly defined as normal and low risk.



Specific Aims

The cry of the newborn infant is a complex signal which, through its components, reflects a host of underlying physiological processes. If the cry is to be meaningfully studied in all its diverse manifestations, it first is necessary to describe it operationally. Clearly, such a process is fundamental to the understanding of the cry










as a communicative act--as well as to the development of cry-based diagnostic systems which seek to discriminate those cries which deviate from the normal, or evaluate cry characteristics for evidence of underlying physiological processes. While a composite review of the cry literature yields data for a broad range of parameters, few studies have been carried out which have applied even a high proportion of these analyses to a single data base. Finally, several types of analysis are now available which previously could not be utilized in cry research; they have the potential to provide excellent information about the cry signal. Consequently, an attempt was made to obtain acoustic and temporal values for neonatal cries in order to enhance the present understanding of the nature and use of the cry signal. There were three major goals:

1) Incorporation and application of the range of cry parameters used

in previous studies to a single subject sample. It was expected

that this research would provide normal data for an extensive

range of acoustic/temporal parameters.

2) Clarification of existing relationships based on cry data

obtained from previous studies. The specific areas included a)

fundamental frequency (including range and mode), b) spectral

(formant) peaks, c) peak intensity level, d) latency and e)

duration.

3) Broadening of the acoustic/temporal cry data base by the introduction of appropriate measures previously unused in cry

research. They included a) jitter and b) spectral distribution

(i.e., long-term power spectra).









Within the context of these goals, a number of specific

hypotheses were tested. Some of these hypotheses are based on previously obtained data, and some are based on subjective observations. They are as follows:

1) Fundamental frequency (F0) values for cries of medically and

developmentally normal infants would be within the lower range

(approximately 350-500 Hz) of values previously reported (See

Table 1-1).

2) Since infants may be expected to exhibit less neuromuscular

control than do older individuals, jitter values for neonates

would be higher than those reported in the literature for adults.

Since no data on jitter presently exist in the infant literature, this information would be new in this area.

3) The reported values for normal crying formant frequencies (See Table 1-2) are correct. That is, major spectral peaks (for cry) would be located at approximately 1400-1600 Hz for F1, 3100-3300

Hz for F2, and 5300-5400 Hz for F3, which relate in turn to vocal

tract size in the neonate.

4) Maximum absolute intensity level values (re: SPL) during crying would be higher than those previously reported (See Table 1-3).

That is, maximum intensity levels at a distance of 24" from the

microphone were expected to be approximately 70-85 dB SPL.

5) Obtained latency values would be consistent with those values reported previously in the literature (See Table 1-4).

Specifically, it was expected that the latency between stimulus

application and the onset of the cry would be approximately

1.4-1.6 seconds. This value should reflect the time necessary










for the healthy infant to process the stimulus and begin the

response.

6) Based on the values reported from previous studies (See Table

1-5), it was expected that the cry durations (for each expiratory

phonation) would be highly variable (mean 0.5-5.0 seconds with a

standard deviation of 2.5-3.5 seconds). Furthermore, it was

hypothesized that the duration of the first cry would be longer

than subsequent cries--and that first cry duration would be

positively correlated with cry latency. Finally, it was

hypothesized (from subjective observations) that cry responses

could be sorted into two categories: one in which the cry began

with a series of short "cough-like" bursts, and a second in which

the cry was initiated with phonation.

7) Based on the hypotheses of Wolff (1969), Truby and Lind (1965),

and Lester and Zeskind (1982), it was speculated that the first

phonations would be different from the second phonations.

Specifically, it was expected that fundamental frequency values,

jitter values, formant frequencies, phonation durations, and peak

intensity levels would all be different for the first than for the second cry. In the perceptual judgements, it was expected

that the panels would rate the first cries as generally more

strained and harsh than the second. Furthermore, the first cry

was expected to have characteristics associated with the pain

stimulus, while the second cry was expected to be more moderate,

progressing toward a "basic" cry.















CHAPTER II
METHOD


Overview

The methods that were used to obtain and analyze the pain cries of a sample of normal, healthy newborn infants will be described below. Specifically, a mild pain stimulus (pressure from a cuticle stick attached to an autolet) associated with the administration of the Brazelton Neonatal Behavioral Assessment Scale (BNBAS) was used to elicit cries from 31 normal healthy infants between the ages of 26 and 30 days--and these cries were recorded for later analysis. That is, six frequency, intensity and durational analyses were carried out in an attempt to define the ranges and characteristics of normal neonatal cry. These data provided an "acoustic profile" for each infant and for the sample. In addition, each cry was judged for "strain" and "harshness" by a panel of trained listeners. Three tests were employed to insure that each infant was physically and behaviorally within normal limits: a medical history (interview with parent), an anthropometric measure and a standardized behavioral assessment test.



Subject Selection

This project represented an attempt to establish stable data for the pain cries of a homogeneous sample of normal, healthy infants at the end of the first month of life. Consequently, rather specific subject selection criteria were needed. For inclusion in the sample, each infant was required to be full term (38-42 weeks gestational age)










and 26-30 days postnatal age. In addition, three criteria were used to insure that each neonate was medically, physically and developmentally within normal or "low-risk" limits. First, the parent was interviewed to determine whether the infant's history included any pre-, peri- or post-natal medical complications. Second, the Ponderal Index (PI), a ratio of body weight to length, was calculated to determine if the child had appropriate fetal and postnatal growth. Finally, developmental and behavioral status was determined by administration of the Brazelton Neonatal Behavioral Assessment Scale (Brazelton, 1976). This test will be referred to subsequently as "The BNBAS" or "The Brazelton." Each of these procedures is discussed below in greater detail.

Officials of the Alachua County (Florida) Board of Health

provided the experimenters with photostatic copies of all certificates of birth filed and recorded in the Alachua County area. The experimenter then attempted to contact the parents of each newborn by telephone. If the parent could be reached, the details of the procedures were explained and the child's participation requested.

Additional subjects were obtained by referral from Gainesville, Florida, pediatricians. In this case, pediatricians were contacted first by letter (See Appendix C) and then by phone and briefed about the general nature of this project. Those physicians who agreed to cooperate referred appropriate patients to the investigators for testing and recording. Parents of prospective subjects were provided with a letter detailing the interests and procedures of this project (See Appendix D).










Appendices I and J contain the age, somatic and Brazelton data for each infant. In all, 31 newborns (22 males and 9 females) were included in the sample. Each infant was of full term gestational age at delivery, and was between 26-30 days (X=28.4, SD=1.0) postnatal age. Birthweights for the babies ranged from 2807 to 4877 grams (X=3645, SD=466), and birth lengths were from 48.3 to 55.9 cm (X=52.8, SD=2.1). All infants had normal and uneventful medical histories, as reported by their parents. Finally, all infants exhibited appropriate weight for length, indicating proper pre- and postnatal nutrition (See Appendix I) and were developmentally within normal limits according to their scores on the BNBAS (See Appendix J).



Procedure

Determinations of Normality

The parent interview

When parents were contacted concerning the Brazelton test, they were told that the research would concern "normal, healthy" babies. Hence, they were asked whether complications had been noted in the child's pre-, peri- or post-natal history. If the parent agreed to have the baby participate in the study, she/he was asked to bring birth information to the experimental session. At that time, the child's birth history (weight, length, head circumference, Apgar scores, length of labor, birth order, anesthesia given mother during labor, and abnormalities of labor) and post-natal history (current statistics, physician's name, findings of post-hospital medical examinations) were reviewed. Indications of medical problems resulted in rejection from the subject pool (although in all cases the









Brazelton was still administered). All parents were asked to sign an informed consent form (See Appendix G) permitting the investigators to contact the physician for additional medical information.

Appropriate fetal growth

While a number of earlier cry studies have assessed normality on the basis of birthweight alone (e.g., normal infants 2500 grams), Rohrer's Ponderal Index (PI) provides a means for determining the appropriateness of a newborn's body weight for his length (Lubchenco et al., 1966; Miller and Hassanein, 1971). (See Appendix E for a more detailed discussion of the PI.) The PI may be calculated as follows:

$$\mathrm{PI} = \frac{\text{weight (g)} \times 100}{\left[\text{crown-heel length (cm)}\right]^{3}}$$

Statistics provided during the interview were used to calculate PI scores for all potential subjects. Since children falling outside the 3rd-97th percentile range have been shown to be at higher risk for morbidity (Miller and Hassanein, 1971), only those infants with birthweights and PIs within that percentile range were accepted as subjects. Thus, acceptable ranges for boys were 5.8-10.1 lbs. and 18.2-21.5 inches, while ranges for girls were 5.8-9.4 lbs. and 18.5-21.1 inches (Stuart and Meredith, 1980). Accordingly, PI ratios of 2.21-2.81 were considered indicative of normal fetal growth (Miller and Hassanein, 1971), and only infants falling within that range were included for participation in the study. It might be noted that the same range of values was used also by Zeskind and Lester (1981) to define a "fullweight" neonatal sample for cry analysis.
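The PI screening step may be illustrated with a brief sketch. The function names below are hypothetical and were not part of the study's procedures; the acceptance range (2.21-2.81) is the one quoted above, and the example values are the sample's mean birthweight and mean birth length, used for illustration only (the PI of the means is not the mean PI).

```python
def ponderal_index(weight_g: float, length_cm: float) -> float:
    """Rohrer's Ponderal Index: weight (g) x 100 / crown-heel length (cm) cubed."""
    if weight_g <= 0 or length_cm <= 0:
        raise ValueError("weight and length must be positive")
    return weight_g * 100.0 / length_cm ** 3


def within_normal_growth(pi: float, low: float = 2.21, high: float = 2.81) -> bool:
    """True when the PI falls inside the acceptance range quoted in the text."""
    return low <= pi <= high


# Illustration only: mean birthweight (3645 g) and mean birth length (52.8 cm).
pi_example = ponderal_index(3645.0, 52.8)
accepted = within_normal_growth(pi_example)
```

An infant whose weight is low for its length yields a PI below the lower bound and would be excluded under this rule.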










Developmental status

All potential subjects were tested for normality by means of the BNBAS. (See Appendix F for a more detailed description.) Since the administration of the Brazelton includes a pain cry elicited from a gentle poke, those cries were recorded for later analysis. Thus, the Brazelton was used in this study 1) to determine that the infant was behaviorally within normal limits and 2) to elicit a standardized cry which could be recorded for later analysis.

The pain stimulus of the Brazelton may be administered only if the infant is in a resting or sleep state. Consequently, infants who were alert during the entire test session could not be included in the subject sample. In addition, infants who did not respond to the pain stimulus with a full cry (state 6 on the consciousness scale utilized in the Brazelton) could not be included in the sample.

A total of 48 infants were administered the BNBAS. Besides the 31 who were included in the subject sample, another 17 infants were disqualified on the following bases: 1) two infants were not judged to be developmentally within normal limits, 2) five were too alert throughout the test to permit the administration of the pain stimulus, 3) five gave no vocal response at all to the stimulus, and 4) the final five infants produced responses that were too "weak" to be considered cries.

The BNBAS is designed to be administered during the newborn period (i.e., up to 30 days). However, it should be delayed until the third day of life to allow the infant to stabilize; i.e., to recover from the stresses of delivery and/or the effects of medications used in the birthing process (Als et al., 1977). This factor presented a









problem for data collection, since local hospitals and birthing centers discharge all apparently normal newborns prior to the third day. The hesitancy of new parents to participate in scientific studies external to the hospital environment is understandable. Consequently, an age range from 26 to 30 days was selected as the time frame for this research.

Only trained and certified evaluators may administer the Brazelton. Since the author does not have this training, the cooperation of a qualified individual (Dr. Susan Armstrong--see Appendix F) was enlisted to perform the test and to interpret the results for the parents.



Collection of Cry Samples

Elicitation of the pain cry

The administration of the Brazelton involves the elicitation of several different responses (e.g., pain, startle, defensive), any of which may include a cry from the neonate. Although the entire test was recorded, the pain cry was the only response of interest, and the only one subjected to analysis.

Each cry episode was initiated by an audible click (from the

autolet) which signified the onset of the cry stimulus. The sample of interest was completed at the end of the second cry cycle (i.e., the second expiratory phonation).

Recording procedures

The subjects were recorded in a sound treated booth (e.g.,

Industrial Acoustics Company, No. 1204.A). They were examined on a padded table, with a mark indicated on the sheet for head positioning










during the application of the pain stimulus. A Quest 1.125-inch PZT ceramic omnidirectional microphone was stabilized on a boom 24" above the surface of the bed. The control of head position and microphone distance permitted the measurement of absolute intensity level.

All acoustic events associated with the Brazelton test were recorded at 7.5 inches/second (ips) on a laboratory quality tape recorder (Teac A-60) using 1.5 mil magnetic tape. A high quality cassette recorder (JVC DD-9) was employed as a backup. A calibration tone (1000 Hz at 110 dB) from a Quest C-12 piston phone calibrator was recorded (at 0 dB VU) on the experimental tapes before and after each experimental session. Tape recorder calibration procedures are outlined in Appendix H. Master tapes from the taping sessions were stored in a cool, dry area. High quality copies were made of the sections containing the cries of interest; all analyses were carried out on the copies.



Analysis Procedures

Acoustic and temporal analyses

Each pain cry sample was analyzed using six procedures. The following natural cry features were examined: 1) crying fundamental frequency, 2) jitter (JIT), 3) formant frequencies (F1-F5), 4) long term spectral distribution (LTS), 5) peak intensity level and 6) timing (i.e., latency and duration). Descriptive statistics were generated for each parameter within each analysis vector.

Crying fundamental frequency. Fundamental frequency values

were determined as follows. The primary analysis was carried out on the IASCP Fundamental Frequency Indicator (FFI). Corroboration of the









mean F0 values was determined from analysis of the harmonic structure of narrow band time-frequency-amplitude (t-f-a) spectrograms of the cries and from the lowest modal peaks of Fast Fourier Transform (FFT) power spectra generated for the cries. Each of these techniques is outlined below.

The Fundamental Frequency Indicator (FFI). FFI consists of a

series of lowpass filters, with cutoffs at half-octave intervals, coupled to high-speed switching circuits which are controlled by

a logic system (Hollien, 1981). FFI measures each wave by

producing a string of pulses--each pulse marking a boundary of

the fundamental period from the complex cry waves--which are

delivered to the computer. The number of clicks generated

between pulses by an electronic clock (100K/sec) is resolved into the time period for each fundamental cycle. The speech signal is

represented as a series of cycle duration interval times:

c(1), c(2), ..., c(n), where n is the number of cycles detected in

the sample being studied. A standard histogram is constructed to

summarize the frequency of occurrence of each length of cycle

value. The cycle/period information c(i) is transformed to

frequency f(i) in Hz where f(i)=1/c(i). The histogram of

frequency/period information was used to compute cry parameters

such as mean value (in Hz and semitones (ST)), the standard

deviation (in ST), the modal frequency (in Hz and ST), the

highest and lowest frequencies (in Hz and ST) detected in the

sample, and the range (in ST). Finally, since the upper limit of

FFI (800 Hz) is inadequate to deal with the range of F0 values

that may be produced by a crying infant, it was necessary to play










each recording at half speed through the system and then double

the output.
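The conversion from clock counts to frequency values described above may be sketched as follows. This is an illustration under stated assumptions, not the FFI software itself: the semitone reference (16.35 Hz), the histogram bin width, and all function names are hypothetical, since none of them is specified in the text. Only the clock rate, the relation f(i)=1/c(i), and the half-speed doubling come from the description above.

```python
import math
from collections import Counter

CLOCK_HZ = 100_000   # clock rate quoted in the text (100K/sec)
ST_REF_HZ = 16.35    # ASSUMPTION: semitone zero reference; the text does not state one


def counts_to_f0(clock_counts, playback_factor=0.5):
    """Convert clock counts per cycle into true F0 values (Hz).

    Each count gives a period c(i) = count / CLOCK_HZ, so f(i) = 1 / c(i).
    playback_factor=0.5 models half-speed playback: the measured frequency
    is half the true one, so dividing by 0.5 doubles the output.
    """
    return [CLOCK_HZ / count / playback_factor for count in clock_counts]


def hz_to_st(f_hz, ref_hz=ST_REF_HZ):
    """Express a frequency in semitones above the (assumed) reference."""
    return 12.0 * math.log2(f_hz / ref_hz)


def summarize_f0(f0_hz, bin_hz=10):
    """Histogram-based summary resembling the parameters listed in the text."""
    if len(f0_hz) < 2:
        raise ValueError("need at least two cycles")
    st = [hz_to_st(f) for f in f0_hz]
    mean_st = sum(st) / len(st)
    sd_st = math.sqrt(sum((s - mean_st) ** 2 for s in st) / (len(st) - 1))
    # Occurrences per frequency bin; keys are lower bin edges (bin width is hypothetical).
    histogram = Counter(int(f // bin_hz) * bin_hz for f in f0_hz)
    return {
        "mean_hz": sum(f0_hz) / len(f0_hz),
        "sd_st": sd_st,
        "mode_hz": max(histogram, key=histogram.get),
        "lo_hz": min(f0_hz),
        "hi_hz": max(f0_hz),
        "range_st": hz_to_st(max(f0_hz)) - hz_to_st(min(f0_hz)),
    }


# Hypothetical clock counts for three cycles measured at half speed.
f0 = counts_to_f0([400, 400, 500])
summary = summarize_f0(f0)
```

The range in semitones does not depend on the assumed reference, because the reference cancels when two semitone values are subtracted.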

Spectrographic Analysis (t-f-a). Narrow band spectrograms have

been the most widely used method of estimating crying fundamental

frequency in the past. Although the spectrogram provides

adequate temporal representation of the signal, resolution of

frequency is poor by contemporary standards, and there is no way

to accurately determine either the mean or the mode of F0. As

mentioned above, calculation of F0 is accomplished by choosing a

point (in time) for measurement, counting the nth harmonic

overtone, using a template to estimate the frequency of that

overtone, and dividing by n. In this study, a Voice

Identification Model 700 spectrograph was employed to estimate

the mean F0 of the cries in question.
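The harmonic-counting estimate reduces to a single division. As a worked illustration with hypothetical values (not read from this study's spectrograms), if the fifth harmonic is located with the template at about 2325 Hz, then

$$F_0 = \frac{f_n}{n} = \frac{2325\ \text{Hz}}{5} = 465\ \text{Hz}$$

A template-reading error of $\Delta f$ on the nth harmonic produces an error of only $\Delta f / n$ in the F0 estimate.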

Fast Fourier Transform Analysis (FFT). This analysis procedure provides information on the relative amount of energy within 40

contiguous one-sixth octave bandwidths. A power spectral curve is

developed which displays the distribution of energy (over time) for the signal in question. Modal F0 may be identified as the

lowest peak in the curve. A Princeton 4512 Fast Fourier

Transform Spectrum Analyzer was utilized for this procedure.

Vocal jitter (JIT). Since the period of each laryngeal cycle was obtained during extraction of fundamental frequency information, that same information was employed to calculate the temporal variability of adjacent periods in terms of mean jitter (in percent). The calculation of a sample's jitter may be expressed as












$$\text{Jitter (\%)} = \frac{100}{n} \sum \frac{\lvert P_1 - P_2 \rvert}{(P_1 + P_2)/2}$$

where P1 and P2 are the periods of adjacent cycles within a phonational sample, and n is the total number of adjacent pairs examined. Thus, jitter is the difference between the periods of each pair of adjacent cycles, divided by the mean period of those two cycles, expressed in percent and averaged over all pairs examined.
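The jitter calculation may be sketched as follows. The function name is hypothetical and the period values are invented for illustration; the arithmetic follows the definition given above (absolute difference between adjacent periods, divided by their mean, averaged over all pairs and expressed in percent).

```python
def mean_jitter_percent(periods):
    """Mean relative difference between adjacent cycle periods, in percent."""
    pairs = list(zip(periods, periods[1:]))
    if not pairs:
        raise ValueError("need at least two periods")
    total = sum(abs(p1 - p2) / ((p1 + p2) / 2.0) for p1, p2 in pairs)
    return 100.0 * total / len(pairs)


# Hypothetical periods (ms): cycle-to-cycle alternation around 2.05 ms.
alternating = mean_jitter_percent([2.0, 2.1, 2.0, 2.1])

# Hypothetical smooth glide: each period is 1% longer than the one before.
glide = mean_jitter_percent([2.0 * 1.01 ** i for i in range(10)])
```

The second example yields a nonzero value (about 1%) even though the change is perfectly regular, which is one way of seeing why continuous increases or decreases in frequency can bias a jitter measure.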

However, measurement of this potentially important cry parameter is difficult for two reasons. First, since jitter is calculated from the changes in frequency between adjacent periods, continuous increases or decreases in frequency may bias the results. Consequently, most studies of jitter have used steady-state phonation for analysis. Obviously, it is not possible to control the vocal output of newborns, nor is it possible to always extract a steady-state portion of the cry for analysis. Second, aperiodicity within the cry signal may result in frequencies that exceed the rules for frequency continuity (i.e., the cast limit) set up within FFI and/or restrict the data utilized in the calculation of jitter. Since FFI will only accept contiguous points from a noisy phonation, the data utilized in jitter calculations can only be as variable as the cast limit. Thus, use of the cast subroutine may result in a diminished representation of the true variability of crying phonation. Conversely, if the cast subroutine is not employed, the noise component introduced into the calculations would result in a jitter value perhaps more accurate relative to the true variability of the signal, but less related to actual phonation than would be desirable. In this study, the decision was made to use the normally utilized cast










limit of six semitones, which would still provide an indication of the variability within each cry, and would allow for comparison to data generated for adults.
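The effect of a cast limit may be illustrated with a simplified sketch. The actual FFI cast subroutine is not described in detail here, so the rule below is an assumption: a cycle is accepted only if its frequency lies within the limit (in semitones) of the previously accepted cycle. The function name and the frequency values are hypothetical.

```python
import math


def apply_cast_limit(f0_hz, limit_st=6.0):
    """Discard cycles that jump more than limit_st semitones from the last accepted cycle.

    ASSUMPTION: a simplified continuity rule standing in for the FFI cast
    subroutine; the first cycle is accepted unconditionally.
    """
    accepted = []
    for f in f0_hz:
        if not accepted:
            accepted.append(f)
            continue
        jump_st = abs(12.0 * math.log2(f / accepted[-1]))
        if jump_st <= limit_st:
            accepted.append(f)
    return accepted


# Hypothetical cycle frequencies (Hz) with one aperiodic outlier at 900 Hz.
kept = apply_cast_limit([450.0, 460.0, 900.0, 455.0, 470.0])
```

Because the outlier is removed before any variability is computed, the retained series can be no more variable than the limit allows, which is the trade-off discussed above.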

In addition, the samples with the highest and lowest jitter

values were checked using another IASCP software approach known as the harmonics/noise (H/N) ratio. In this procedure, the signal is digitized and a cursor is used to mark the boundaries of successive periods. The length of each marked period is then automatically calculated and the resultant jitter value computed.

In spite of the difficulties outlined above, a relative jitter value can provide direct quantification of the variability within crying phonation. Crying may be considered to represent a particular but extreme example of phonation, with variability that is dependent upon the nature of the stimulus, the infant's internal state and laryngeal dynamics. Because there is no cry jitter information preceding the data generated by this study, it should be considered exploratory in nature.

Formant frequencies (F1-F5). As noted in Chapter I, a number of investigators have reported values for the first three frequency formants of neonatal cries. These regions of concentrated energy reflect the resonances imposed on the acoustic signal by the anatomical characteristics of the vocal tract. While most workers have used t-f-a spectrograms as a basis for analysis, some (e.g., Gardosik et al., 1980) have employed power spectra. The analysis of formant values undertaken in this study utilized both techniques. Determinations of up to five formant frequencies for each phonational sample were made from power spectra generated on the Princeton 4512









FFT spectrum analyzer, and corroboration of those values was made from spectrographic analysis (Voice Identification Model 700 spectrograph). In addition, mean formant frequencies (for all cries, for first and for second phonations) were obtained from the average power spectral curve generated for the long term spectral analysis.

Long term spectral analysis (LTS). Analysis of power spectra has been used relatively successfully in the development of speaker identification techniques (e.g., Doherty, 1975) and the evaluation of voice and speech disorders (e.g., Frokjaer-Jensen and Prytz, 1974; Wendler et al., 1980), as well as in cry research (Colton et al., in press). In contrast to the specification of formant frequencies, LTS provides information about the relative distribution of energy throughout the frequency spectrum during vocalization. In infants, it should provide information about energy emphasis within general bandwidths (i.e., broad changes in resonance) and the relative amounts of noise within a phonation (peak to spectral "floor" distance). However, since very few spectral data exist as yet, the information generated here should be considered tentative. Hopefully these data will be compared eventually to data developed for other normal infants as well as compromised samples. The system utilized at IASCP includes a Princeton 4512 FFT spectrum analyzer coupled to the PDP-11/23 computer. This vector uses 1/6 octave bands to generate a 33-parameter power spectrum curve covering the frequency range from 140 to 7,220 Hz.

Intensity. Four previous studies have reported peak intensity levels in the crying infant (Ringel and Kluppel, 1964; Sheppard and Lane, 1968; Zeskind and Lester, 1976; Colton and Steinschneider,









1980). As noted in Chapter I, the reported values have been widely disparate (42-82 dB SPL), and the procedures generally poorly described.

Recordings for this study were calibrated for intensity both

prior to and after each experimental session. Control of the infant's head positioning and of the microphone distance is outlined in the section on recording procedures (above). In addition, the tape recorder sensitivity was calibrated to 0 dB VU (input and output) at 90 dB SPL. When the cries were later analyzed, the Bruel & Kjaer type 2305 graphic level recorder was in turn calibrated to the 0 dB VU (90 dB SPL) output of the tape recorder. The resulting strip charts displayed the varying intensity level of the cry episode through time, so that peak intensity levels for each sample could be determined.
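The relation between a calibrated reference and an absolute level may be sketched as follows. The function names and the trace values are hypothetical; the only figure taken from the text is the 90 dB SPL level assigned to the 0 dB VU reference. The sketch assumes that the strip chart is read in dB relative to that reference line.

```python
import math


def level_db_spl(v_rms, v_cal_rms, cal_spl=90.0):
    """Absolute level from a voltage ratio relative to the calibration reference."""
    if v_rms <= 0 or v_cal_rms <= 0:
        raise ValueError("voltages must be positive")
    return cal_spl + 20.0 * math.log10(v_rms / v_cal_rms)


def peak_level_db_spl(trace_db_re_cal, cal_spl=90.0):
    """Peak level of a strip-chart trace given in dB relative to the calibration line."""
    return cal_spl + max(trace_db_re_cal)


# Hypothetical trace readings (dB relative to the 0 dB VU calibration line).
peak = peak_level_db_spl([-20.0, -8.5, -12.0])
```

A signal with one tenth of the reference voltage lies 20 dB below the reference, i.e. at 70 dB SPL under this calibration.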

Timing. Nine parameters were examined in the course of the

temporal analysis in this study. (A schematic depiction of a cry event is presented in Figure 2-1). The measures used included

1) St-V1--This parameter was a measure of the elapsed time (in ms)

from the stimulus application to the first pre-phonational

vocalization (i.e., a cough, inspirational gasp or other short

burst). This latency period related to the time required for the

infant to process and respond to the stimulus.

2) #Vs St-P1--This parameter was a simple count of the number of pre-phonatory bursts occurring between the stimulus application

and the first phonation.

3) St-P1--The elapsed time (in ms) from the stimulus to the onset of the first phonation. This, like St-V1, was also a latency period






















Figure 2-1. Schematic of the neonatal pain cry. Uppercase letters denote major events within
the cry. Lowercase letters define the segments within the cry.

[Schematic not reproduced: events ST, V1, P1, V2 and P2 along a time axis marked a, b, d, e, g, h, with c and f marking counts of non-phonatory vocalizations.]

Legend

ST--Stimulus Application
V1--First Non-phonatory Vocalizations
P1--First Phonation
V2--Second Non-phonatory Vocalizations
P2--Second Phonation

ab--ST to V1 onset
ad--ST to P1 onset
ab/ad--ST to V1 onset or P1 onset
bd--V1 onset to P1 onset
c--# of V1s
de--P1 duration
eg--P1 offset to P2 onset
f--# of V2s
gh--P2 duration









(stimulus processing and response time) if the infant did not

respond first with a pre-phonational vocalization.

4) V1-P1--The elapsed time (in ms) from the first pre-phonatory

vocalization to the onset of the first phonation. This measure

represented a transition from the interruptive response of the

pre-phonation to the beginning of a release of air.

5) St-V1/P1--The elapsed time (in ms) from the stimulus application

to either the first pre-phonatory vocalization or the first

phonation. This parameter combines the first (St-V1) and third

(St-P1) parameters to derive an accurate measure of the actual latency from stimulus application to the first vocal response.

6) P1 On-Off--Duration (in ms) of the first phonation, from onset to

offset. This measure describes the duration of the first vocal

expiratory release after the stimulus.

7) P1-P2--The elapsed time (in ms) from the offset of the first

phonation to the onset of the second phonation. This parameter

relates to the time required for respiratory recovery,

inhalation, and readjustment for the next phonation.

8) #Vs P1-P2--A simple count of the number of non-phonatory bursts

between the first and second phonation. This measure indicates

the proclivity for interruption within the second cry cycle.

9) P2 On-Off--Duration (in ms) of the second phonation, from onset to

offset. This is a measure of the time expended in energy output

in the second phonation.

In order to measure each segment of the cry episode, the graphic level recorder (described in the previous section on intensity) was utilized. The temporal segments in the cry episode could be identified










and measured by noting the boundaries indicated by sudden changes in intensity.
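The nine timing parameters defined above can be derived from a small set of event times once the boundaries have been read from the strip chart. The sketch below is illustrative only: the data layout, the names, and the example times are hypothetical, and the parameters are undefined (None) where no pre-phonatory vocalization occurred.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class CryEpisode:
    """Event times in ms from the start of the recording (hypothetical layout)."""
    stimulus: float
    p1_onset: float
    p1_offset: float
    p2_onset: float
    p2_offset: float
    v1_onsets: List[float] = field(default_factory=list)  # bursts before P1
    v2_onsets: List[float] = field(default_factory=list)  # bursts between P1 and P2


def timing_parameters(ep: CryEpisode) -> dict:
    """Compute the nine timing measures listed in the text."""
    first_v1 = min(ep.v1_onsets) if ep.v1_onsets else None
    st_v1 = first_v1 - ep.stimulus if first_v1 is not None else None
    st_p1 = ep.p1_onset - ep.stimulus
    return {
        "St-V1": st_v1,
        "#Vs St-P1": len(ep.v1_onsets),
        "St-P1": st_p1,
        "V1-P1": ep.p1_onset - first_v1 if first_v1 is not None else None,
        # Latency to the first vocal response of either kind.
        "St-V1/P1": st_v1 if st_v1 is not None else st_p1,
        "P1 On-Off": ep.p1_offset - ep.p1_onset,
        "P1-P2": ep.p2_onset - ep.p1_offset,
        "#Vs P1-P2": len(ep.v2_onsets),
        "P2 On-Off": ep.p2_offset - ep.p2_onset,
    }


# Hypothetical episode: two bursts precede the first phonation.
episode = CryEpisode(stimulus=0.0, p1_onset=2000.0, p1_offset=4500.0,
                     p2_onset=5200.0, p2_offset=6900.0,
                     v1_onsets=[1200.0, 1500.0], v2_onsets=[4800.0])
params = timing_parameters(episode)
```

For an episode with no pre-phonatory burst, St-V1/P1 falls back to St-P1, which matches the definition of the fifth parameter.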



Perceptual analyses

This project was developed as an attempt to establish the

acoustic and temporal correlates of the cry signal. However, certain details of the cry signal tend to be missed by the types of electroacoustical analyses used here. For example, although it was clear from listening that some phonations were noisy, this parameter was difficult to quantify. Moreover, most of the analysis procedures used in this project allow for the examination of a signal independent of time (i.e., rather than moment by moment). Consequently, transient events within the cry signal tended to be overlooked. While listening to the cries, it became clear that a thorough description of the signals required the detailing of particular types of events within each signal. In addition, it was the author's judgement that first cries were rougher and more strained than second cries, although that judgement could not be verified without testing other individuals. As a result, two different types of perceptual analyses were carried out.

In the first, the author simply rated each cry for the presence of four characteristics:

1) roughness--briefly intermittent to continuous and "saturated"

levels of embedded noise (aperiodic phonation or "dysphonation").

2) register shifts--"voice breaks" or sudden transitions in vocal

output from one register to another (generally a higher one) and

back again.










3) frequency variability--extreme fluctuations in pitch. These broad

variations in frequency could be glides that were increasing or decreasing in frequency, or wavers in which the frequency seemed

to "wander."

4) vocal fry--this phenomenon tended to occur at the end of a

sustained phonation, when available air to power the cry was

greatly reduced.

The second analysis procedure focused upon the perceived level of roughness and strain in each cry. Roughness was broadly defined for the task as noise, aperiodicity or harshness in the signal during phonation. It related to the first parameter of the author's judgements. On the other hand, strain related to the perception of inordinate exertion during phonation, and probably would include the perception of fry (#4 above). For this procedure, a panel of five trained listeners--speech pathologists and/or phoneticians who were all specialists in voice--were asked to rate each cry on a five point scale (1=least, 5=most) for levels of strain and roughness.



Statistical Analysis Procedures

The procedures outlined above resulted in a data-base consisting of measurements from six acoustic/temporal analysis procedures (involving 109 parameters), as well as perceptual judgements by a panel of trained listeners and by the author. This study was intended primarily to be descriptive in nature; consequently, means and standard deviations for all phonations were determined for all parameters.










Moreover, one aim of this study was the comparison of first and second phonation (P1 and P2) cry characteristics for each cry. Thus, means and standard deviations for P1 and P2 were calculated for all frequency, formant, long term spectral and intensity parameters. In addition, the mean and standard deviation of all listeners' perceptual judgements (relative to roughness and strain) were calculated for each phonation. Paired comparison t-tests were applied to the raw data for P1 and P2 from each of these acoustic and perceptual parameters.
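A paired comparison of P1 and P2 values may be sketched as follows. The function name and the data are hypothetical, and the sketch returns only the t statistic and its degrees of freedom; the significance level would be read from a t table or a statistics package. An infant with only one phonation would have to be dropped before pairing.

```python
import math


def paired_t(first, second):
    """Paired-comparison t statistic and degrees of freedom for matched samples."""
    if len(first) != len(second):
        raise ValueError("samples must be paired")
    diffs = [a - b for a, b in zip(first, second)]
    n = len(diffs)
    if n < 2:
        raise ValueError("need at least two pairs")
    mean_d = sum(diffs) / n
    var_d = sum((d - mean_d) ** 2 for d in diffs) / (n - 1)
    se = math.sqrt(var_d / n)
    if se == 0:
        raise ValueError("differences have zero variance")
    return mean_d / se, n - 1


# Hypothetical ratings of strain for P1 and P2 from four infants.
t_stat, df = paired_t([5, 7, 9, 6], [4, 5, 6, 6])
```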

Finally, in an effort to determine the strength of relationship

between each parameter and all other parameters, a Pearson correlation matrix was calculated.
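The construction of a Pearson correlation matrix may be sketched as follows. The parameter names and values are hypothetical and serve only to show the computation; they are not data from this study.

```python
import math


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length samples."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two samples of equal length (n >= 2)")
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("a constant sample has no defined correlation")
    return sxy / math.sqrt(sxx * syy)


def correlation_matrix(columns):
    """Correlate every parameter with every other parameter."""
    names = list(columns)
    return {a: {b: pearson_r(columns[a], columns[b]) for b in names} for a in names}


# Hypothetical values for three parameters measured on four cries.
data = {
    "latency_ms": [1200, 1500, 1400, 1700],
    "p1_duration_ms": [2000, 2600, 2500, 3100],
    "mean_f0_hz": [480, 450, 470, 440],
}
matrix = correlation_matrix(data)
```

The matrix is symmetric with ones on the diagonal, so only the entries on one side of the diagonal carry independent information.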















CHAPTER III
RESULTS

The pain cries of 31 infants were studied. Subjects, who were between the ages of 26 and 30 days, were administered the BNBAS; the cries were elicited by the pain stimulus (or "poke") administered as part of that examination. All infants whose cries were analyzed were found to be within normal limits both in terms of body size (on the Ponderal Index--Appendix I) and developmentally (on the Brazelton--Appendix J).

As was discussed in Chapter II, the recorded cries were subjected to six acoustic and temporal analyses: 1) fundamental frequency, 2) jitter, 3) formant frequencies, 4) long-term spectral composition, 5) intensity and 6) timing. Moreover, a panel of trained listeners perceptually judged each cry in an attempt to characterize levels of "roughness" and "strain." The results obtained from each of these procedures are presented below.



Acoustical Analyses

Fundamental Frequency (F0)

As stated, a recording was made of the first two phonational portions of the cry response to a pain stimulus. Each separate phonation was subsequently rerecorded to form a (continuous) cry sample lasting approximately twenty seconds. Analyses then were carried out for the following fundamental frequency parameters using the IASCP Fundamental Frequency Indicator (FFI): 1) mean (X) frequency








57

in Hz and semitones (ST), 2) standard deviation (SD) of that frequency in ST, 3) modal (Mode) frequency (Hz and ST), 4) lowest (Lo) detected frequency (Hz and ST), 5) highest (Hi) detected frequency (Hz and ST) and 6) range (ST). In addition, the obtained mean FO values were confirmed by examination of modal peaks occurring on long-term spectral plots and by consideration of FO from narrow-band time-frequency-amplitude (t-f-a) spectrograms. Analysis of a related vector, jitter (JIT), was carried out using the frequency data generated by FFI; these values are presented in the fundamental frequency tables as well but will be discussed in a later section.

The results of the individual FO analyses (as well as group means and standard deviations for each parameter) for the first two phonational portions of each cry* are presented in Table 3-1. The verification of the fundamental frequency data is presented in Table 3-2.

As may be seen from consideration of Table 3-1, a wide range of individual mean FO values was found for the cries; these values ranged from 312 to 1299 Hz (X=505 Hz, SD=3.0 ST), a difference of more than two octaves. However, upon closer examination (see the scatterplot presented in Figure 3-1), the data clustered into two frequency categories. Of 61 cries analyzed, 57 (93.4%) registered a mean FO below 600 Hz, while the remaining four (6.6%) were above 900 Hz. When the means of each of these clusters were calculated (see Table 3-3), the average of the low frequency cries dropped to 465 Hz (SD=3.0 ST), and the "hyperphonational" cries were more than one octave higher at

* There are two samples for all but one (#11) of the 31 infants.
In that case, the baby achieved one full cry in response to the
stimulus but then quieted.








Table 3-1. Fundamental frequency values for first phonation
(P1) and second phonation (P2) cries. Values were
generated using the IASCP Fundamental Frequency
Indicator (FFI). Frequency values are presented
in Hertz (Hz) and/or semitones (ST).


Cry X X SD Mode Mode Lo Lo Hi Hi Range %
# Hz ST ST Hz ST Hz ST Hz ST ST JIT

1.1 557 61 4.2 494 59 291 50 1068 73 23 15.0
1.2 495 59 2.5 494 59 288 50 605 63 13 8.0
2.1 488 59 4.3 554 61 330 52 798 67 15 9.6
2.2 435 57 1.7 440 57 349 53 530 60 7 8.1
3.1 479 58 5.0 440 57 255 48 988 71 23 10.9
3.2 471 58 4.0 466 58 275 49 719 66 17 21.4
4.1 541 61 8.4 523 60 234 46 1319 76 30 11.5
4.2 312 51 11.5 370 54 87 29 1397 77 48 25.3
5.1 381 55 2.8 370 54 257 48 544 61 13 9.4
5.2 445 57 5.1 392 55 246 47 1110 73 26 5.3
6.1 404 56 6.0 392 55 233 46 494 59 13 8.5
6.2 465 58 1.6 440 57 370 54 648 64 10 3.2
7.1 420 56 2.2 415 56 277 49 623 63 14 7.9
7.2 457 58 1.3 466 58 370 54 509 60 6 6.8
8.1 487 59 1.6 466 58 349 53 641 64 11 9.0
8.2 520 60 1.0 523 60 416 56 623 63 7 4.7
9.1 423 56 1.4 440 57 339 52 466 58 6 10.1
9.2 455 58 2.0 440 57 294 50 587 62 12 5.4
10.1 461 58 6.1 440 57 220 45 880 69 24 14.1
10.2 548 61 2.0 554 61 415 56 659 64 8 9.9
11.1 399 55 3.3 415 56 262 48 1175 74 26 10.4
12.1 445 57 10.2 415 56 139 37 1319 76 39 12.6
12.2 537 60 2.6 494 59 441 57 641 64 7 9.9
13.1 497 59 1.9 494 59 393 55 587 62 7 10.6
13.2 513 60 1.6 523 60 415 56 659 64 8 7.2
14.1 1299 76 1.5 1319 76 848 68 1480 78 10 3.4
14.2 586 62 1.1 554 61 467 58 1397 77 19 4.8
15.1 572 62 2.2 554 61 432 57 726 66 9 11.8
15.2 594 62 1.2 587 62 523 60 698 65 5 5.8
16.1 956 70 1.8 880 69 762 67 1319 76 9 4.7
16.2 595 62 7.1 831 68 277 49 988 71 22 13.6
17.1 391 55 2.2 392 55 262 48 554 61 13 9.7
17.2 365 54 2.7 349 53 262 48 509 60 12 6.5
18.1 359 53 8.4 466 58 139 37 1047 72 35 13.1
18.2 425 56 2.2 392 55 311 51 554 61 10 9.8
19.1 389 55 5.9 370 54 294 50 784 67 17 14.3
19.2 443 57 3.0 466 58 262 48 610 63 15 7.5
20.1 555 61 1.9 523 60 360 54 784 67 13 7.6
20.2 574 62 3.2 554 61 370 54 1319 76 22 6.1
21.1 484 59 1.3 466 58 392 55 623 63 8 6.6
21.2 527 60 1.2 523 60 415 56 641 64 8 3.8
22.1 389 55 2.3 392 55 196 43 1397 77 34 16.0
22.2 459 58 2.0 466 58 349 53 554 61 8 13.3








Table 3-1 -- continued.

Cry X X SD Mode Mode Lo Lo Hi Hi Range %
# Hz ST ST Hz ST Hz ST Hz ST ST JIT

23.1 414 56 1.7 415 56 302 50 587 62 12 6.4
23.2 404 56 2.1 415 56 277 49 539 61 12 10.2
24.1 473 58 1.5 466 58 370 54 523 60 6 4.9
24.2 570 61 1.3 554 61 440 57 784 67 10 3.1
25.1 417 56 1.7 415 56 277 49 554 61 12 6.0
25.2 431 57 4.2 415 56 294 50 1319 76 26 6.4
26.1 478 58 1.8 494 59 293 50 932 70 20 7.6
26.2 457 58 0.7 440 57 392 55 554 61 6 2.2
27.1 507 59 1.6 523 60 392 55 587 62 7 6.3
27.2 527 60 1.3 494 59 415 56 622 63 7 4.4
28.1 404 56 2.2 415 56 294 50 494 59 9 10.5
28.2 437 57 1.4 415 56 349 53 554 61 8 6.2
29.1 1126 73 2.0 1245 75 847 68 1371 77 9 8.8
29.2 938 70 3.5 988 71 466 58 1480 78 20 4.1
30.1 416 56 2.9 415 56 254 47 587 62 15 10.4
30.2 394 55 2.4 349 53 330 52 554 61 9 5.7
31.1 419 56 0.9 415 56 370 54 498 59 5 3.4
31.2 413 56 2.0 415 56 349 53 494 59 6 8.6

X 505 59 3.0 505 59 347 52 796 66 15 9.0
SD 171 4 2.3 183 5 138 7 318 6 9 4.0








Table 3-2. Corroboration of FFI fundamental frequency
(FO) values. Mean and modal FO values were
generated for checking from
time-frequency-amplitude (TFA) spectrograms
(mean) and Fast Fourier Transform (FFT)
spectral analysis (FFT).


Cry FFI FFI TFA FFT Cry FFI FFI TFA FFT
# X Mode X Mode # X Mode X Mode

1.1 557 494 570 560 1.2 495 494 500 500
2.1 488 554 460 2.2 435 440 400 430
3.1 479 440 450 480 3.2 471 466 450 460
4.1 541 523 550 560 4.2 312 370 280
5.1 381 370 360 380 5.2 445 392 430 430
6.1 404 392 400 420 6.2 465 440 450 460
7.1 420 415 400 410 7.2 457 466 460 460
8.1 487 466 490 460 8.2 520 523 525 520
9.1 423 440 430 430 9.2 455 440 500 500
10.1 461 440 450 460 10.2 548 554 550 560
11.1 399 415 400 430 --- --- --- --- ---
12.1 445 415 440 440 12.2 537 494 525 520
13.1 497 494 420 430 13.2 513 523 450 520
14.1 1299 1319 1330 1290 14.2 586 554 600 590
15.1 572 554 560 570 15.2 594 587 560 570
16.1 956 880 900 940 16.2 595 831 650 590
17.1 391 392 375 440 17.2 365 349 340 360
18.1 359 466 330 320 18.2 425 392 375 410
19.1 389 370 410 380 19.2 443 466 450 460
20.1 555 523 570 590 20.2 574 554 550 590
21.1 484 466 480 480 21.2 527 523 510 520
22.1 389 392 400 410 22.2 459 466 440 440
23.1 414 415 430 410 23.2 404 415 400 410
24.1 473 466 500 580 24.2 570 554 580 590
25.1 417 415 430 410 25.2 431 415 400 410
26.1 478 494 480 480 26.2 457 440 450 460
27.1 507 523 500 500 27.2 527 494 500 500
28.1 404 415 400 410 28.2 437 415 450 450
29.1 1126 1245 1200 1190 29.2 938 988 1100 1120
30.1 416 415 450 460 30.2 394 349 450 460
31.1 419 415 400 400 31.2 413 415 400 410

FFI FFI TFA FFT
X Mode X X

X 505 505 508 510
SD 171 183 187 183


*--could not be determined








Figure 3-1. Scatterplot of mean fundamental frequency values (in kHz) for each cry sample.








Table 3-3. Summary table of frequency parameters. Mean and
standard deviation values are provided for each
parameter in Table 3-1. Values are presented
also for the categories of 1) low frequency
phonations, 2) high frequency phonations,
3) first phonations and 4) second phonations.


          X     X     SD    Mode  Mode  Lo    Lo    Hi    Hi    Range %
          Hz    ST    ST    Hz    ST    Hz    ST    Hz    ST    ST    JIT

All Cries (N=61)
XA        505   59    3.0   505   59    347   52    796   66    15    9.0
SDA       171   4     2.3   183   5     138   7     318   6     9     4.0

All Low Frequency Cries (X FO < 700 Hz) (N=57)
XA(L)     465   58    3.0   457   58    320   51    753   65    15    8.9
SDA(L)    66    2     2.3   60    2     85    5     281   6     9     4.3

All High Frequency Cries (X FO > 700 Hz) (N=4)
XA(H)     1080  72    2.2   1053  72    731   65    1413  77    12    5.3
SDA(H)    169   3     0.9   219   4     181   5     81    1     5     2.4

First Cries Only (N=31)
X1        517   59    3.3   505   59    344   51    772   67    16    10.0
SD1       215   5     2.4   183   5     173   7     308   7     9     6.2

Low Frequency First Cries (X1 FO < 700 Hz) (N=28)
X1(L)     452   57    3.0   457   58    293   49    771   66    16    10.0
SD1(L)    59    2     2.0   60    2     74    5     281   6     9     3.0

High Frequency First Cries (X1 FO > 700 Hz) (N=3)
X1(H)     1127  73    1.8   1053  72    819   68    1390  77    10    6.0
SD1(H)    172   3     0.3   219   4     49    1     82    1     1     3.0

Second Cries Only (N=30)
X2        493   59    2.7   494   59    350   52    762   66    13    7.9
SD2       110   3     2.2   131   4     87    6     312   6     9     5.0

Low Frequency Second Cries (X2 FO < 700 Hz) (N=29)
X2(L)     478   58    3.0   464   58    346   52    737   65    13    7.8
SD2(L)    71    3     2.0   66    3     86    6     286   5     9     4.8

High Frequency Second Cries (X2 FO > 700 Hz) (N=1)
X2(H)     938   71    3.5   988   71    466   58    1480  78    20    4.1









1080 Hz (SD=2.2 ST). In addition, it should be noted that the four high frequency cries were produced by three infants; three of the cries were first phonations, while the fourth cry was a second phonation following a high frequency first phonation by one of the three. Thus, the normal cry response is bi-modal. While the normal cries in this study were most often low frequency in nature, a few hyperphonational cries occurred as well, generally in the first phonation following the stimulus.

When first cries (P1) were compared to second cries (P2), there were no apparent differences in mean FO (X(P1)=517 Hz, X(P2)=493 Hz); a paired t-test comparison (see Table 3-4) confirmed this relationship (t=0.76, P=0.27). Further, when the standard deviations (i.e., the variability) of first cries (SD(P1)=3.3 ST) were compared to those of second cries (SD(P2)=2.7 ST) using a paired t-test, the difference, while stronger, still was not significant (t=1.20, P=0.12). Thus, the characteristics of fundamental frequency (and laryngeal dynamics) did not appear to change from the first cry to the second.
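The paired comparisons summarized in Table 3-4 rest on the usual relationship between the mean difference, the standard deviation of the differences and the number of pairs. The following Python sketch is illustrative only: the function names are hypothetical, and it works from the rounded summary values in Table 3-4 rather than from the raw data, so the recomputed t values can be expected to agree with the tabled ones only approximately.

```python
import math

N_PAIRS = 30  # 30 infants produced both phonations (df = 29)

# (mean difference, SD of differences, tabled S.E., tabled t), from Table 3-4.
TABLE_3_4 = {
    "X1 vs. X2 (ST)":   (0.5, 3.6, 0.7, 0.76),
    "SD1 vs. SD2 (ST)": (0.6, 2.6, 0.5, 1.20),
    "Md1 vs. Md2 (ST)": (0.4, 3.7, 0.7, 0.54),
    "Lo1 vs. Lo2 (ST)": (0.9, 7.8, 1.4, 0.65),
    "Hi1 vs. Hi2 (ST)": (0.6, 6.5, 1.2, 0.51),
    "Rg1 vs. Rg2 (ST)": (1.6, 10.8, 2.0, 0.80),
    "JT1 vs. JT2 (%)":  (1.5, 5.1, 0.9, 1.56),
}


def standard_error(sd_of_diffs, n_pairs):
    """Standard error of the mean paired difference."""
    return sd_of_diffs / math.sqrt(n_pairs)


def paired_t(mean_diff, sd_of_diffs, n_pairs):
    """Paired-difference t statistic from summary values."""
    return mean_diff / standard_error(sd_of_diffs, n_pairs)


for label, (diff, sd, se_tabled, t_tabled) in TABLE_3_4.items():
    se = standard_error(sd, N_PAIRS)
    t = paired_t(diff, sd, N_PAIRS)
    print(f"{label:18s} S.E.={se:.2f} (tabled {se_tabled})  "
          f"t={t:.2f} (tabled {t_tabled})")
```

Because the inputs are rounded to one decimal place, small discrepancies in the second decimal of t are to be expected.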

While the mean provides information concerning the mathematical center of frequency production, the mode describes the frequency produced most consistently. Thus, it was interesting to note the correlation between the values for the two parameters (r=0.95, P<0.001). Mean value for both parameters was the same (505 Hz), although the modal standard deviation was one semitone greater. In only three cases out of 61 was there more than a two semitone difference between them (4.1--3 ST, 18.1--6 ST, 16.2--7 ST). The strong correspondence between these two parameters suggests that the








Table 3-4. Results of paired difference t-tests between
selected frequency parameters of phonation 1
and phonation 2.


Parameters          Mean Diff   SD                          One Tail
Tested              ST          ST      S.E.   df    t      P=

X1 vs. X2 (ST)      0.5         3.6     0.7    29    0.76   .27

SD1 vs. SD2 (ST)    0.6         2.6     0.5    29    1.20   .12

Md1 vs. Md2 (ST)    0.4         3.7     0.7    29    0.54   .30

Lo1 vs. Lo2 (ST)    0.9         7.8     1.4    29    0.65   .26

Hi1 vs. Hi2 (ST)    0.6         6.5     1.2    29    0.51   .31

Rg1 vs. Rg2 (ST)    1.6         10.8    2.0    29    0.80   .28

JT1 vs. JT2 (%)     1.5         5.1     0.9    29    1.56   .06*


* Significant at the .05 level.








(easier to obtain) modal frequency may provide an index of fundamental frequency behavior that is as accurate as the mean.

The distribution of the observed FO values (i.e., waveform periods detected during the analysis of each sample) within the group of cries was quite broad, extending over four octaves (49 ST) between 87 and 1480 Hz. Individual cries varied as little as 5 ST (not quite one-half octave) or as much as 48 ST (or four octaves). However, the means of the detected low and high frequency values were 347 Hz (SD=7 ST) and 796 Hz (SD=6 ST) respectively, a range of about 15 ST (SD=9 ST) or slightly more than 1 octave. Again, there was no difference between the ranges of first and second cries. Thus, while most infants cried at approximately the same mean frequency, the variability around that mean was considerable for each infant, indicating little volitional control over the crying act.
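The Hz and semitone (ST) figures quoted in this section can be related with a short calculation. The Python sketch below assumes that the ST scale is referenced to 16.35 Hz; that reference is an inference from the paired Hz/ST values in Table 3-1 (e.g., 440 Hz = 57 ST) and is not stated in this chapter. The function names are illustrative.

```python
import math

# Assumed reference for the semitone (ST) scale. Inferred from the Hz/ST
# pairs in Table 3-1; not stated explicitly in the text.
ST_REFERENCE_HZ = 16.35


def hz_to_st(freq_hz, ref_hz=ST_REFERENCE_HZ):
    """Semitone level of freq_hz above the (assumed) reference frequency."""
    return 12.0 * math.log2(freq_hz / ref_hz)


def interval_st(low_hz, high_hz):
    """Size of the interval between two frequencies, in semitones."""
    return 12.0 * math.log2(high_hz / low_hz)


# Lowest and highest detected FO values in the whole sample (Table 3-1).
print("87 Hz   ->", round(hz_to_st(87)), "ST")
print("1480 Hz ->", round(hz_to_st(1480)), "ST")
print("overall range:", round(interval_st(87, 1480)), "ST")

# Span of the individual mean FO values (312 to 1299 Hz), in octaves.
print("mean FO span:", round(interval_st(312, 1299) / 12.0, 2), "octaves")
```

Because an interval in semitones depends only on the ratio of the two frequencies, the range values do not depend on the assumed reference; only the absolute ST levels do.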

Jitter

The jitter metric is a measure of the percent of change in fundamental frequency from cycle to cycle; further, it provides an index of the variability of vocal fold vibration. As may be seen in Table 3-1, the values obtained for jitter during crying phonation were extremely wide ranging; they extended from 2.2% (#26.2) to 25.3% (#4.2), with a mean of 9.0% and a standard deviation of 4.0%. (The high and low jitter values were validated approximately using the H/N ratio, described in Chapter II; the obtained values using the second procedure were 3.7 and 29.2%.) A significant (t=1.56, P<0.10) difference was found between the jitter values for the first and second cry phonations. The wide range of values obtained for this parameter is highly correlated with both the high FO standard deviation values (r=0.66, P<0.001) and the wide range of detected frequencies (r=0.61, P<0.001) for FO. Indeed, all these relationships point to the extreme variability of vocal fold vibration during crying, probably due to intermittent aperiodicity during phonation as well as continual changes in respiratory and laryngeal dynamics.
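As an illustration of what a percent jitter value expresses, the following Python sketch computes the mean absolute cycle-to-cycle change in FO as a percentage of the mean FO. This is one common formulation and is an assumption here; the computation actually applied to the FFI data is the one described in Chapter II, and the function name and sample values below are hypothetical.

```python
def percent_jitter(f0_by_cycle_hz):
    """Mean absolute cycle-to-cycle F0 change, as a percentage of mean F0.

    f0_by_cycle_hz: sequence of F0 estimates (Hz), one per glottal cycle.
    """
    if len(f0_by_cycle_hz) < 2:
        raise ValueError("at least two cycles are needed")
    # Absolute change between each pair of consecutive cycles.
    changes = [abs(b - a) for a, b in zip(f0_by_cycle_hz, f0_by_cycle_hz[1:])]
    mean_change = sum(changes) / len(changes)
    mean_f0 = sum(f0_by_cycle_hz) / len(f0_by_cycle_hz)
    return 100.0 * mean_change / mean_f0


# Hypothetical cycle-by-cycle values: a steady phonation and an unstable one.
steady = [450.0, 452.0, 449.0, 451.0, 450.0]
unstable = [400.0, 440.0, 400.0, 440.0]
print(round(percent_jitter(steady), 2))
print(round(percent_jitter(unstable), 2))
```

Under this definition a perfectly periodic signal yields 0%, and values rise as successive cycles differ more from one another.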

Formant Frequencies

Formant frequencies are spectral resonances that are generated by the voice. As such, they provide information about the dynamics of the vocal tract during sound production. One of the goals of this study was to determine the formant values that are produced during crying, and to attempt to relate them to the characteristics of the infant vocal tract.

The resonance values obtained for each cry are presented in Table 3-5. Three to five peaks were evident on the long term spectral curves generated for each phonational sample (see next section and Table K-1 in Appendix K). If fewer than five peaks were apparent for a phonational sample, it was assumed (for purposes of calculation) that certain resonances might be obscured by aperiodic energy within the same bandwidth, or by having been "averaged out" over time. In addition, while most of the spectral peaks were confirmed by observation of concentrated energy regions on t-f-a spectrograms, some resonances were not apparent on the spectrograms. This was probably due to the fact that LTS displays energy summated over time, while the t-f-a procedure displays a moment-to-moment energy distribution.








Table 3-5. Formant frequency values for each cry sample.
Mean and standard deviation values are presented
for first and second phonation, and for all cries.
Formant values were derived from long term spectral
analysis and t-f-a spectrographic analysis.


Cry F1 F2 F3 F4 F5    Cry F1 F2 F3 F4 F5
# Hz Hz Hz Hz Hz    # Hz Hz Hz Hz Hz

1.1 1060 1700 2140 3820 6080    1.2 1000 1510 2140 3820 6830
2.1 660 1190 3820 7660    2.2 940 1190 1700 3030 6080
3.1 940 1340 1910 3820 7660    3.2 940 1340 2140 2700 4820
4.1 1060 2400 4820 6080    4.2 940 1340 2400 4540 7660
5.1 1190 1510 2400 3820 7660    5.2 880 1190 2140 3820 *
6.1 830 1700 3820 5410    6.2 940 1340 2700 3820 6830
7.1 830 1190 1910 3400 4290    7.2 940 1340 2140 2700 4290
8.1 940 1340 1910 2400 4820    8.2 1060 1510 2400 4290 6830
9.1 830 1340 2400 4290    9.2 1060 1510 2700 3820 6080
10.1 940 1340 2700 3820 5410    10.2 1120 1510 1910 2700 6080
11.1 1000 2400 3820 6080    ---- ---- ---- ---- ---- ----
12.1 830 1340 3030 5410    12.2 1000 2140 2700 3400 *
13.1 830 1340 2700 4820    13.2 940 1340 1910 3820 6830
14.1 1910 3820 5410    14.2 1190 2140 3400 5410
15.1 1190 1700 2140 3400 5720    15.2 1190 1700 2700 3820 6080
16.1 1910 4290    16.2 940 1190 1910 3820 6830
17.1 940 1190 2400 3400 4820    17.2 1060 1340 2400 3400 5410
18.1 660 1060 2400 3820 5410    18.2 830 1060 2140 3400 6080
19.1 780 1190 2700 3400 6080    19.2 940 1340 2700 3400 7660
20.1 830 1060 1700 3820 6080    20.2 1120 1910 3820 6080
21.1 940 1340 2400 3400 6080    21.2 1060 1510 1910 3400 7660
22.1 830 1190 1910 4040 6080    22.2 830 1190 1700 3400 5410
23.1 830 1190 2140 3820 6830    23.2 830 1190 2140 3820 6830
24.1 940 1910 3820 6830    24.2 1190 1700 2400 3400 6830
25.1 830 1190 2400 4290 6080    25.2 830 1190 2400 4290 6080
26.1 660 880 2260 3030 6030    26.2 940 1340 2400 3820 6080
27.1 940 1340 2400 3820 6080    27.2 940 2400 3820 7660
28.1 830 1190 2140 3820 6080    28.2 940 1510 1910 3820 6080
29.1 1510 2400 4290 6830    29.2 2140 3030 6080
30.1 940 1340 2140 3820 6080    30.2 1000 1510 2400 4290 6080
31.1 1000 1510 2400 4290 6080    31.2 740 1190 2140 3400 6080

X(Hz) 896 1316 2261 3730 5888    964 1376 2211 3535 6192
SD(Hz) 134 201 280 433 823    108 225 302 456 870

ALL PHONATIONS
F1 F2 F3 F4 F5

X(Hz) 929 1347 2235 3631 6043
SD(Hz) 126 214 290 452 853


*--no formant was evident in this frequency range









Thus, a slight but continuous resonance might be evident using LTS but might not be apparent using spectrographic analysis.

Mean values for the five resonant frequencies (R1 to R5) were 929, 1347, 2235, 3631 and 6043 Hz respectively. These peaks were approximately confirmed by generation of the averaged long term spectral data for all phonations (see Table 3-7 and Figure 3-2 below). When all spectral data are averaged, five peaks emerge at approximately 940, 1190, 2140, 3400 and 6080 Hz. The differences between means of the respective peaks using these two methods are small enough to be within the standard deviation for each value.

Although the spectral peaks represent vocal tract resonances, it is difficult to associate the peaks with particular vocal tract characteristics. Individual cry phonations most often encompassed a range of vowel-like sounds (and vocal tract configurations). Since the frequencies associated with each sound were analyzed over time, the averaged resonant values tend to be related to a general vocal posture during crying rather than to specific vowel or vowel-like sounds.

Similarly, the complexity of the obtained spectral peaks and a lack of anatomical data about infants preclude definitive statements concerning the relationship between vocal tract size and resonant frequencies. For example, if the vocal tract transfer function (described in Chapter I) is applied to the first resonant peak (929 Hz), an average vocal tract length of 9.1 cm is described. This value is larger by 1.0-1.5 cm than estimates of the neonatal vocal tract posited by other investigators, but potentially still within the bounds of accuracy. However, it is possible also that the first peak simply represents the first harmonic overtone, since it is twice the frequency of the fundamental. If R1 represents only a harmonic overtone, it might be expected that other peaks would be evident at approximate multiples of the fundamental, but there are none. If the second peak is assumed to be the first formant, a vocal tract length of 6.3 cm is obtained, slightly smaller than the 7.5-8.0 cm estimation, but also a reasonable estimate of vocal tract size in infants. As discussed earlier, it may be fruitless to apply such modeling procedures to data on infants since very little is known about the physical characteristics of the system being studied. Further, knowledge of the relationship between crying spectral peaks and vocal tract configuration may depend on further evidence relative to the anatomical characteristics of the infant sound production mechanism, as well as data about specific infant sounds and their frequency correlates.
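The two vocal tract lengths quoted above can be approximated with a simple quarter-wavelength model. The Python sketch below is offered under stated assumptions: a uniform tube closed at the glottis and open at the lips, and a speed of sound of 340 m/s. The transfer function of Chapter I is not reproduced in this chapter, so this is a plausible reading of the calculation rather than a restatement of it, and the names are illustrative.

```python
# Assumed speed of sound (about 340 m/s), expressed in cm/s.
SPEED_OF_SOUND_CM_PER_S = 34000.0


def tract_length_cm(first_resonance_hz, c=SPEED_OF_SOUND_CM_PER_S):
    """Length (cm) of a uniform tube, closed at one end, whose lowest
    resonance is first_resonance_hz (quarter-wavelength model: F1 = c / 4L)."""
    return c / (4.0 * first_resonance_hz)


# Mean resonance values for all phonations (Table 3-5).
R1_HZ = 929.0
R2_HZ = 1347.0

print("R1 taken as first formant:", round(tract_length_cm(R1_HZ), 1), "cm")
print("R2 taken as first formant:", round(tract_length_cm(R2_HZ), 1), "cm")
```

Under these assumptions the model yields lengths close to the 9.1 cm and 6.3 cm figures given in the text; a different assumed speed of sound would shift both values proportionally.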

In order to test for possible differences in vocal tract configuration between the first and second phonation, formant values were compared using a paired t-test. The results of that procedure are presented in Table 3-6. Only F1 showed a significant difference (P<0.01) due to sequence of phonation. The similarity between the resonances for the first and second cries suggests that the configuration of the vocal tract remains approximately the same during the progress of the cry episode.

Long Term Spectral Distribution

While the formant frequencies provide information about the resonant characteristics of the vocal tract, examination of the entire








Table 3-6. Results of paired difference t-tests between
formant frequency values of phonation 1 and
phonation 2.


Parameters   Mean Diff.   SD                          One Tail
Tested       Hz           Hz       S.E.    df   t     P=

F1           70.4         146.4    28.7    25   2.45  0.01*

F2           92.9         247.8    50.6    23   1.44  0.08

F3           62.6         355.6    68.4    26   0.92  0.31

F4           197.0        629.2    121.1   26   1.63  0.06

F5           265.4        1087.2   213.2   25   1.25  0.11


*significant at the P<0.01 level.









spectrum can provide information about the relative energy levels between frequency intervals.

The analysis of spectral distribution (over time) resulted in -dB values (re: 1 volt) for 33 frequency intervals. Table K-1 (in Appendix K) contains a listing of the obtained values for each cry. Table 3-7 provides a summary of mean and standard deviation values by frequency interval for 1) all cry phonations, 2) first phonations and 3) second phonations. In turn, Figure 3-2 shows a mean power spectral curve for all cries and Figure 3-3 compares the spectral curves for the first and second phonation. Figures 3-4 and 3-5 display the spectral curves for the first and second phonations with a representation of the standard deviation for each interval on the ordinate.

As was noted in the last section, the averaged spectrum for all phonations exhibited peaks at 460, 940, 1190, 2140, 3400 and 6080 Hz, thus closely approximating the mean fundamental frequency value and the formant frequency values obtained in other procedures. The greatest amount of energy was centered between approximately 830 and 2700 Hz, with the maximal peak occurring at 1190 Hz. However, with the exception of the peak at 6080 Hz (i.e., the upper end of the spectrum), the difference between the maximal and minimal peak was only 5 dB. Also, the peak at 3400 Hz appears to represent the beginning of the terminus of the distribution, with the energy rolling off rapidly (10 dB/octave) at the higher frequencies.
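The peak frequencies named above correspond to the centers of the analysis bins listed in Table 3-7. The following Python sketch makes that correspondence explicit and recomputes the spread between the strongest and weakest peaks below 6 kHz. The bin edges and levels are transcribed from the "All Phon." column of Table 3-7; the variable and function names are illustrative, and treating the bin midpoint as the peak frequency is an assumption.

```python
# Bins associated with spectral peaks in the averaged spectrum for all
# phonations (Table 3-7). The 440-480 Hz bin is the one the text relates
# to the mean FO. Each entry: (lower edge Hz, upper edge Hz, level in -dB).
PEAK_BINS = [
    (440, 480, 41.8),
    (880, 1000, 39.6),
    (1120, 1260, 38.2),
    (2020, 2260, 39.6),
    (3200, 3600, 43.6),
    (5720, 6440, 51.3),
]


def bin_center(lower_hz, upper_hz):
    """Arithmetic midpoint of an analysis bin (assumed peak frequency)."""
    return (lower_hz + upper_hz) / 2.0


centers = [bin_center(lo, hi) for lo, hi, _ in PEAK_BINS]

# Levels are expressed as -dB re: 1 volt, so a SMALLER number means MORE
# energy. The 6080 Hz peak is excluded, as in the text.
levels_below_6k = [level for _, hi, level in PEAK_BINS if hi < 5000]
peak_spread_db = max(levels_below_6k) - min(levels_below_6k)

print("peak frequencies (Hz):", centers)
print("spread among peaks below 6 kHz (dB):", round(peak_spread_db, 1))
```

The maximal peak is the bin with the smallest -dB value (1120-1260 Hz), consistent with the 1190 Hz maximum reported in the text.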

Table 3-7 and Figures 3-3, 3-4 and 3-5 present the averaged data for the first and second phonations. The first phonation exhibits greater energy levels than the second phonation in the upper and lower








Table 3-7. Averaged long term spectral data for all cry phonations,
and for first and second phonations. Values represent mean total
energy levels for each bin in -dB re: 1.00 volt.


              All Phon.     1st Phon.     2nd Phon.     Paired   Two
Frequency     X      SD     X      SD     X      SD     diff.    Tail
Range (Hz)    -dB    -dB    -dB    -dB    -dB    -dB    t        P=

140-160       45.8   4.1    44.3   2.9    47.4   4.5    -3.48    0.002**
160-180       46.6   4.3    45.1   3.3    48.1   4.8    -3.56    0.001**
180-200       46.0   4.4    44.5   3.6    47.6   4.7    -3.32    0.002**
200-220       47.1   4.6    45.6   3.8    48.6   4.8    -3.14    0.004**
220-260       46.9   6.2    44.6   6.1    49.3   5.4    -3.82    0.001**
260-300       47.2   6.0    45.3   5.2    49.2   6.2    -2.76    0.010**
300-340       46.9   6.5    44.7   5.6    49.1   6.7    -3.38    0.002**
340-380       50.7   13.2   47.5   11.6   53.9   14.1   -2.43    0.022*
380-440       44.6   13.1   43.6   12.8   45.4   13.5   -0.94    0.353
440-480       41.8   9.8    42.5   9.2    41.0   10.6   0.81     0.423
480-560       45.2   10.2   46.3   9.2    44.0   11.2   1.25     0.222
560-620       47.6   10.6   48.6   8.5    46.6   12.4   0.77     0.447
620-700       50.2   8.9    50.0   8.6    50.3   9.4    -0.16    0.876
700-780       46.6   9.1    45.4   9.4    47.8   8.8    -1.21    0.236
780-880       40.5   9.4    39.8   10.5   41.2   8.2    -0.93    0.363
880-1000      39.6#  9.8    39.8#  9.8    39.5#  10.0   0.19     0.852
1000-1120     40.1   9.2    40.3   7.3    39.9   10.9   0.19     0.854
1120-1260     38.2#  9.4    38.2#  7.6    39.0#  11.1   -0.36    0.719
1260-1420     42.5   9.5    41.6   8.9    43.5   10.2   -1.39    0.174
1420-1600     43.4   11.2   41.2   8.2    45.6   13.4   -1.90    0.067
1600-1800     44.1   13.2   42.5   12.6   45.8   13.8   -1.50    0.145
1800-2020     41.2   13.9   40.5   14.7   42.0   13.1   -0.54    0.593
2020-2260     39.6#  12.6   38.8   12.2   40.6#  12.9   -0.68    0.499
2260-2540     40.1   13.8   37.5#  11.3   42.7   15.7   -2.22    0.034*
2540-2860     41.7   11.4   39.7   9.2    43.8   13.1   -1.82    0.080
2860-3200     44.2   12.0   42.9   9.5    45.6   14.2   -1.06    0.298
3200-3600     43.6#  13.0   41.8   10.4   45.3#  15.2   -1.28    0.209
3600-4040     45.0   16.6   41.7#  13.5   48.3   18.9   -2.00    0.055
4040-4540     48.4   17.6   44.6   14.5   52.3   19.8   -2.50    0.018*
4540-5100     50.8   17.9   46.9   14.4   54.7   20.4   -2.94    0.006**
5100-5720     51.5   15.8   48.6   11.5   54.5   19.1   -2.34    0.026*
5720-6440     51.3#  15.5   49.0   12.5   53.6#  18.1   -2.00    0.055
6440-7220     53.8   16.0   51.3   12.7   56.4   18.8   -2.00    0.055


#--spectral peak
*--significant at the 0.05 level.
**--significant at the 0.01 level.






Figure 3-2. Averaged long term spectral curve for all phonations.






Figure 3-3. Comparison of the average long term spectral curves for the first and second phonations.








Figure 3-4. Averaged long term spectral curve for first phonation.




Figure 3-5. Averaged long term spectral curve for second phonation.








frequency regions, while energy levels are almost congruent in the central frequencies, where the peaks are clustered. However, the variability of the second phonation is consistently higher (P<0.01) than that of the first. As is evident from Figures 3-4 and 3-5, the lower frequency values are less variable than those in the central and upper frequencies, resulting in significant differences (P<0.05) between P1 and P2 in the spectral levels of the lower eight frequency intervals. In addition, the spectral values for three of the upper six frequency intervals are higher (P<0.05) for P1 than for P2.

The similarity between the shapes of the first and second phonational spectral curves indicates that the acoustic energy output during phonation does indeed follow a stylized format. However, an increase in energy level for P1 in the region below the fundamental frequency (with no concomitant change in energy level in the central frequencies) suggests a lower signal-to-noise ratio for the first than the second phonation, probably due to an increase in fundamental frequency variability and/or aperiodicity. That is, as the variability in fundamental frequency rises, or as the level of low frequency phonational noise increases, so does the energy level in the lower frequencies.

The higher upper frequency energy levels for P1 relative to P2 probably are not related to noise in the signal, but rather to better high frequency resonance from greater mouth opening (Fant, 1970). That is, since the first phonation is more proximal to the stimulus, the articulatory excursion in the response may be more extreme, with a resulting increase in high frequency energy.









Intensity

The peak intensity of the cry provides an index of the infant's respiratory power and the strength of the response. Peak absolute intensity values (in dB SPL) for each cry phonation are presented in Table 3-8 (as measured 24 inches above the examination table, and directly above the head of the infant). As can be seen, cries ranged in intensity from 75 to 99 dB, with a mean and standard deviation for the first cry of 88.7 and 5.1 dB respectively, and 89.0 and 6.0 dB for the second phonation.

There was no significant difference in the intensity of first cries vs. second cries (t=0.35, P=0.36). Furthermore, the likelihood of a louder cry on the first phonation (12/30, or 40%) was about the same as for the second (15/30, or 50%). In three cases (10%), both first and second cry were approximately of the same intensity.

Temporal Analyses

As discussed in Chapter II, the cry signal was divided into discrete segments that represented specific activities during the cry cycle. Each segment was timed for each infant; the results of that procedure are provided in Table 3-9.

Latency. Four different latency parameters were examined: 1) the elapsed time between the stimulus and a pre-phonatory vocalization (St-V1), 2) the elapsed time between the stimulus and the first phonation (St-P1), 3) the elapsed time between a pre-phonatory vocalization (if it occurred) and the following first phonation (V1-P1) and 4) the elapsed time between the stimulus and the first occurring vocalization (i.e., either pre-phonatory or phonatory) (St-V1/P1). In addition, two parameters were counts of the number of








Table 3-8. Peak absolute intensity
values (in dB SPL) for
each cry phonation. The microphone
was positioned 24" above the examination
table and directly above
the infant's head.


Subj. P1 P2
# Peak Int. Peak Int.

1 97.0 96.0
2 93.0 86.0
3 89.0 89.5
4 94.0 94.5
5 92.0 83.5
6 94.0 95.0
7 87.0 77.5
8 90.0 97.0
9 95.0 94.0
10 93.0 97.0
11 88.0 ----
12 81.0 88.0
13 78.0 84.0
14 80.5 89.0
15 85.0 89.0
16 95.5 95.0
17 90.0 89.0
18 85.0 87.0
19 87.0 91.0
20 92.0 85.0
21 87.5 89.5
22 78.0 78.0
23 90.0 85.0
24 92.0 99.0
25 82.5 92.5
26 84.0 84.0
27 86.0 75.0
28 90.0 90.0
29 94.0 91.0
30 91.0 94.5
31 87.0 85.0

X 88.7 89.0
SD 5.1 6.0
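The first-versus-second phonation comparison reported for intensity can be retraced from the values in Table 3-8. The Python sketch below counts which phonation was louder for each infant and computes a paired-difference t statistic using only the standard library. The data are transcribed from Table 3-8 (infant 11, who produced no second phonation, is omitted); the variable names are illustrative, and the sketch is not a record of the software originally used.

```python
import math
from statistics import mean, stdev

# Peak intensity (dB SPL) for infants 1-10 and 12-31, in subject order.
P1 = [97.0, 93.0, 89.0, 94.0, 92.0, 94.0, 87.0, 90.0, 95.0, 93.0,
      81.0, 78.0, 80.5, 85.0, 95.5, 90.0, 85.0, 87.0, 92.0, 87.5,
      78.0, 90.0, 92.0, 82.5, 84.0, 86.0, 90.0, 94.0, 91.0, 87.0]
P2 = [96.0, 86.0, 89.5, 94.5, 83.5, 95.0, 77.5, 97.0, 94.0, 97.0,
      88.0, 84.0, 89.0, 89.0, 95.0, 89.0, 87.0, 91.0, 85.0, 89.5,
      78.0, 85.0, 99.0, 92.5, 84.0, 75.0, 90.0, 91.0, 94.5, 85.0]

# Paired differences (second phonation minus first).
diffs = [second - first for first, second in zip(P1, P2)]

louder_first = sum(d < 0 for d in diffs)
louder_second = sum(d > 0 for d in diffs)
equal = sum(d == 0 for d in diffs)

# Paired t statistic: mean difference over its standard error.
t_stat = mean(diffs) / (stdev(diffs) / math.sqrt(len(diffs)))

print("louder first phonation: ", louder_first)
print("louder second phonation:", louder_second)
print("equal intensity:        ", equal)
print("paired t:", round(t_stat, 2))
```

The counts partition the 30 infants who produced both phonations, which is why the percentages in the text are expressed out of 30 rather than 31.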








Table 3-9. Timing data for individual subjects. A key for the
variable notation is provided on the following page.


Cry   St-V1   #Vs     St-P1   V1-P1   St-V1/P1   P1 On-Off   P1-P2   #Vs     P2 On-Off
#     Secs    St-P1   Secs    Secs    Secs       Secs        Secs    P1-P2   Secs

1     2.50    4       5.03    2.53    2.50       1.90        0.77    0       2.78
2     1.10    1       2.10    1.00    1.10       3.93        0.93    1       1.00
3     1.70    2       2.78    1.08    1.70       3.12        0.30    1       1.33
4     ----            1.25    ----    1.25       4.53        0.67    1       1.00
5     ----            1.17    ----    1.17       5.93        1.02    5       0.33
6     1.30    2       3.97    2.67    1.30       3.05        0.47    0       1.62
7     1.93    2       2.70    0.77    1.93       3.38        0.37    0       0.90
8     1.12    2       3.00    1.88    1.12       3.63        0.40    0       1.88
9     ----            4.68    ----    4.68       5.67        0.28    1       0.72
10    ----            1.33    ----    1.33       14.75       0.37    0       0.88
11    ----            2.27    ----    2.27       3.17        ----            ----
12    1.22    3       4.07    2.85    1.22       2.47        0.53    0       0.92
13    ----            0.50    ----    0.50       3.77        0.40    0       1.42
14    1.48    1       2.20    0.72    1.48       2.08        1.07    0       1.22
15    ----            1.57    ----    1.57       0.73        1.37    1       0.50
16*   ----                                       1.12        2.42    0       0.72
17    ----            1.40    ----    1.40       8.43        0.38    0       1.87
18    ----            0.82    ----    0.82       4.07        0.38    0       1.85
19    1.13    2       1.55    0.42    1.13       9.90        0.37    0       2.18
20    4.03    1       4.90    0.87    4.03       1.70        0.67    0       1.77
21    1.73    1       2.47    0.74    1.73       2.63        0.43    1       1.57
22    ----            2.00    ----    2.00       3.08        0.70    1       1.43
23    ----            0.90    ----    0.90       7.83        0.63    0       0.82
24    1.57    2       3.73    2.16    1.57       2.75        0.40    0       1.90
25    ----            1.77    ----    1.77       4.27        1.50    1       1.67
26    ----            0.92    ----    0.92       4.00        2.35    0       4.70
27    1.47    1       2.10    0.63    1.47       3.67        0.60    0       0.95
28    1.30    4       2.17    0.83    1.30       5.67        0.47    0       2.47
29    ----            1.32    ----    1.32       7.48        0.35    0       2.05
30    0.87    3       2.67    1.32    0.87       5.07        0.30    0       1.72
31    1.77    8       10.10   8.33    1.77       0.85        5.80    6       0.65

X     1.64    2.4     2.58    1.83    1.60       4.34        0.89    0.67    1.49
SD    0.75    1.8     1.89    1.91    0.87       2.92        1.08    1.40    0.85


* The recording was lost of these portions of the cry from infant 16,
from stimulus presentation to just before the onset of the first
phonation.








Notes--Table 3-9.

St-V1--Time in seconds (secs) from stimulus to first vocalization.
#Vs St-P1--# of vocalizations between stimulus and 1st phonation.
St-P1--Time (secs) from stimulus to 1st phonation.
V1-P1--Time difference between the onset of a prephonatory vocalization and the following phonation.
St-V1/P1--Time (secs) from stimulus to either the 1st vocalization or the 1st phonation.
P1 On-Off--Duration of 1st phonation.
P1-P2--Time (secs) between end of 1st phonation and beginning of 2nd phonation.
#Vs P1-P2--# of vocalizations between end of P1 and beginning of P2.
P2 On-Off--Duration of 2nd phonation.









non-phonatory bursts in the period before P1 (i.e., #Vs St-P1) and before P2 (#Vs P1-P2).

Sixteen of the thirty-one infants, or about half, began the cry event with one to eight pre-phonatory vocalizations (X=2.4, SD=1.8), beginning slightly less than a second to more than four seconds (X=1.64, SD=0.75) after application of the stimulus. These vocal bursts generally were expiratory in nature, and often had the quality of a "cough."

As may be seen, if the infant began the cry with a pre-phonatory burst, the latency value to the onset of phonation (X=2.58, SD=1.89) became inflated, since it actually provides information about the latency to two different events. However, the amount of time between the pre-phonatory event (when one occurs) and the onset of phonation (V1-P1, X=1.83, SD=1.91) provides information relative to the ability of the infant to resolve the "coughing" sequence and (vocally) exhale. It is perhaps noteworthy that this parameter is approximately the same as for the latency to the pre-phonatory burst. (It should be mentioned also that this comparison is obscured by the fact that infant 31 had an unusual cry sequence characterized by many pre-phonatory bursts and extremely short phonational durations. If the data from this infant are excluded from the sample, then X(V1-P1)=1.4 and SD(V1-P1)=0.83. Other statistics change as well: X,SD(#Vs/St-P1)=1.07,1.28; X,SD(St-P1)=2.32,1.27; X,SD(#Vs/P1-P2)=0.48,0.99; X,SD(P1-P2)=0.72,0.56.)

The fourth latency variable (St-V1/P1, X=1.6, SD=0.87) is of interest also since it provides information on the response (i.e., sensory, neurological and motor processing) time from the stimulus to any first vocalization. In addition, this parameter is the most comparable to those examined in other studies involving latency (See Table 1-4). The mean values, which are derived from St-V1 values and St-P1 values (where there was no pre-phonatory event), are essentially the same as those from the St-V1 data only. To be specific, an unpaired t-test of the difference between the data for the two parameters indicates nonsignificance (t=0.85, P=0.42). That is, the time required for infants to process the stimulus and then respond was essentially stable, independent of the manner of the response (i.e., prephonatory vocalization or first phonation).

Duration. The durations of both the first and second phonations were measured from the onset to the cessation of vocal/expiratory activity. As can be seen in Table 3-8, the mean and standard deviation of the first phonation were 4.34 and 2.92 secs. respectively, while for the second phonation the values were 1.49 and 0.85 secs. When a paired t-test was carried out between these two parameters, the difference was found to be significant at the P<0.01 level (t=5.16, P=0.00038). The relative energy expenditure in the cries (as evidenced through duration) is, as with several other parameters, consistently emphasized proximal to the initiating stimulus. The first phonation represents the abrupt release of air after the inspirational response to the stimulus, while the second phonation duration generally begins to conform to the regular patterning of the basic cry that appears to follow.









The Perceptual Analyses

As was discussed in Chapter II, certain characteristics of the cry tend not to be assessed in the types of acoustical and temporal analyses described above. Thus, in an attempt to more fully describe the cry act, two separate procedures were carried out. In the first, the investigator carried out a subjective analysis of each cry relative to the presence or absence of: roughness, vocal fry, extreme variations in frequency and register shifts (or breaks). In the second, more formal procedure, a panel of trained listeners rated each cry for levels of roughness and strain.

Author's judgements of "cry quality" characteristics

The author's judgements relative to four characteristics of cry quality are presented in Table L-1 of Appendix L.

Roughness. A three level scale of roughness was applied. Out of 61 cries, five cries were rated as extremely rough, 29 were rated as moderately noisy and 27 cries had clear phonation. When the first and second cries were compared, first cries were more often extremely noisy (P1=3, P2=2) or moderately noisy (P1=16, P2=13), and less often had clear phonation (P1=12, P2=15).

Vocal fry. Over 40% (25) of the cries were found to exhibit vocal fry during some portion of the signal. First cries were more than twice as likely (P1=17, P2=8) to include fry as second cries. Moreover, all those infants who exhibited fry in the second phonation also had exhibited it in the first.

Variations in frequency. Over 20% (13) of the cries displayed an inordinate amount of frequency variability. Again, this feature was more likely to occur in the first phonation (P1=10) than in the second (P2=3). However, there did not appear to be a consistent relationship between P1 and P2 wavers. In all infants but one, wavers occurred in either the first or the second phonation, but not in both.

Register shifts. Six infants displayed register shifts at some point in the cry. Five of the infants shifted upward. The sixth infant, who hyperphonated for most of the cry, made the transition into the modal register, and then achieved loft once more. Four of the register shifts occurred in the first phonation. Of the two occurring in the second phonation, one was part of a hyperphonation (mentioned above) and the other was part of a modal phonation which followed a hyperphonational first cry.

Perceptions of roughness and strain

As was indicated previously, high levels of noise (i.e., "dysphonation") often were present during phonation. Although it has correlates in such variables as F0 standard deviation and jitter, phonational noise is a difficult parameter to measure. Indeed, its physical specification (relative to cry) rarely has been addressed in the literature (Golub (1980) is a notable exception). Consequently, a small panel of highly trained listeners (clinical and experimental specialists in voice) was asked to judge each cry on a scale of one to five for two parameters (roughness and strain). Roughness was defined as perceived aperiodicity, noise or harshness, while strain was defined as a noticeable increase in effort or difficulty due to constriction during vocal output. The averaged results are presented in Table 3-10. Individual responses are provided in Tables L-2 and L-3.









Table 3-10. Perceptual responses of five trained listeners to each cry. Listeners were asked to rate the cries for strain and roughness along a 5 point scale (1:least-5:most). Means and standard deviations for all responses are provided, as well as differences in mean response between the first and second phonation of a cry event.

                  STRAIN                          ROUGHNESS
Baby     1st         2nd     1st-2nd       1st         2nd     1st-2nd
 #     X    SD     X    SD    X1-X2      X    SD     X    SD    X1-X2

 1    4.2  0.4    3.2  0.4     1.0      2.6  1.5    2.5  0.9     0.0
 2    5.0  0.0    1.8  0.4     3.2      4.8  0.4    2.4  1.1     2.4
 3    2.8  0.8    1.6  0.5     1.2      4.0  1.0    2.0  0.7     2.0
 4    4.4  0.5    3.8  1.3     0.6      3.6  1.1    4.4  0.9    -0.8
 5    5.0  0.0    3.6  1.1     1.4      4.4  0.9    2.4  0.9     2.0
 6    3.2  1.1    1.0  0.0     2.2      3.0  1.0    1.0  0.0     2.0
 7    2.0  0.7    1.0  0.0     1.0      2.0  1.0    1.2  0.4     0.8
 8    4.6  0.5    2.8  1.6     1.8      3.6  0.5    3.8  0.4    -0.2
 9    4.8  0.4    2.8  0.8     2.0      4.4  0.9    3.4  1.1     1.0
10    1.8  0.8    3.4  0.5    -1.6      1.8  0.4    4.0  1.0    -2.2
11    4.4  0.5    ---  ---     ---      3.8  0.8    ---  ---     ---
12    3.0  1.2    1.4  0.5     1.6      4.6  0.5    2.4  0.5     2.2
13    2.8  0.8    1.6  0.9     1.2      3.0  0.7    2.0  0.0     1.0
14    2.2  0.8    2.2  0.8     0.0      1.4  0.5    3.4  0.5    -2.0
15    4.6  0.9    4.4  0.9     0.2      3.6  0.5    2.8  0.8     0.8
16    2.6  1.1    2.0  0.7     0.6      3.8  0.4    3.8  0.4     0.0
17    4.4  0.9    2.0  0.7     2.4      4.0  0.7    3.6  0.5     0.4
18    4.6  0.5    3.0  1.0     1.6      4.8  0.4    3.6  0.9     1.2
19    4.8  0.4    1.4  0.5     3.4      3.8  0.8    1.4  0.5     2.4
20    4.4  0.5    3.2  1.3     1.2      3.4  1.5    3.6  1.1    -0.2
21    2.8  0.4    1.8  0.4     1.0      2.8  0.8    2.6  0.9     0.2
22    2.2  0.8    1.2  0.4     1.0      2.0  0.7    2.2  1.1    -0.2
23    4.2  0.4    2.4  1.1     1.8      4.2  0.4    2.8  0.8     1.4
24    3.0  0.7    1.2  0.4     1.8      2.6  0.9    1.8  0.8     0.8
25    5.0  0.0    3.0  1.0     2.0      3.2  1.3    4.0  0.7    -0.8
26    4.4  0.9    2.8  0.8     1.6      4.6  0.5    3.0  0.7     1.6
27    3.2  0.4    1.2  0.4     2.0      2.8  0.8    1.4  0.5     1.4
28    4.6  0.5    2.6  1.3     2.0      3.8  1.3    3.6  0.5     0.2
29    2.6  1.1    2.8  1.3    -0.2      3.2  0.8    4.6  0.5    -1.4
30    2.4  0.9    2.8  2.5    -0.4      2.8  0.4    4.2  0.4    -1.4
31    1.6  0.9    2.6  1.5    -1.0      2.2  1.3    2.4  1.3    -0.2

 X                             1.2                               0.5
SD                             1.1                               1.3









A paired t-test was used to test the difference in the responses for the first and second phonations of each parameter. The results clearly indicate (P<0.01) that the first cry is perceived as rougher (t=2.5, P=0.009) and more strained (t=6, P=0.00001) than the second.
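As a rough illustration of the paired comparison described above, the sketch below computes a paired t statistic from the summary rows at the foot of Table 3-10 (mean and SD of the 1st-2nd differences). The function name and the choice of 30 complete pairs (infant 11 has no second-phonation ratings) are assumptions made for this example, not part of the original analysis. Because the tabled summaries are rounded to one decimal, the values produced are only approximate and need not match the reported statistics exactly.

```python
import math


def paired_t_from_summary(mean_diff, sd_diff, n_pairs):
    """Paired t statistic from the mean and SD of the pairwise differences.

    Hypothetical helper for illustration only:
    t = mean_diff / (sd_diff / sqrt(n_pairs)).
    """
    standard_error = sd_diff / math.sqrt(n_pairs)
    return mean_diff / standard_error


# Assumption: 30 infants contributed both a first and a second phonation.
N_PAIRS = 30

# Rounded summary values from the last two rows of Table 3-10.
t_strain = paired_t_from_summary(mean_diff=1.2, sd_diff=1.1, n_pairs=N_PAIRS)
t_roughness = paired_t_from_summary(mean_diff=0.5, sd_diff=1.3, n_pairs=N_PAIRS)

print(f"strain:    t = {t_strain:.2f} (df = {N_PAIRS - 1})")
print(f"roughness: t = {t_roughness:.2f} (df = {N_PAIRS - 1})")
```

Working from the individual difference scores rather than the rounded summaries would give a more faithful check of the reported t values.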

While the acoustic and temporal analyses quantify various aspects of the cry, those components that are not measured but are perceptible must be acknowledged as well. For example, a rating of phonational noise provides an index of the relative size of the noise component within the signal, and adds a dimension to data which might otherwise be indistinguishable during comparison procedures. The finding that first cries were perceived as both rougher and more strained than second cries impinges on the frequency information. Specifically, these perceptual data indicate that, in spite of similar F0 values, phonational patterns for first cries are different from those for second cries. That difference may be understood in terms of irregular vocal fold vibration, and greater tension in the laryngeal mechanism.



Summary

The data generated for this project are summarized below for each vector.

Fundamental frequency

The group mean (and mode) for cry fundamental frequency was found to be 505 Hz, with a 3 ST standard deviation. The range of individual means extended over two octaves from 312 to 1299 Hz. The range of detected frequencies within individual cries averaged 15 ST (SD=9 ST), or over one octave.
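The semitone (ST) and octave figures above follow from the usual logarithmic relation between frequency ratio and musical interval (12 ST per octave). The sketch below is a minimal illustration of that conversion, applied to the reported range of individual means; the function name is hypothetical and chosen for this example only.

```python
import math


def semitone_interval(f_low_hz, f_high_hz):
    """Interval between two frequencies in semitones (12 per octave)."""
    return 12.0 * math.log2(f_high_hz / f_low_hz)


# Range of individual mean fundamental frequencies reported in the text.
range_st = semitone_interval(312.0, 1299.0)
range_octaves = range_st / 12.0

# Roughly 24.7 ST, i.e. just over two octaves.
print(f"{range_st:.1f} ST = {range_octaves:.2f} octaves")
```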









The cries clustered into two frequency ranges--most (93%) were below 600 Hz (X=465, SD=3.0), while a few cries (7%) were above 900 Hz (X=1080, SD=2.2). Of the four hyperphonational cries, three occurred as first phonations and one occurred as a second phonation following a hyperphonational first phonation.

No significant differences between first and second phonations were found for any of the frequency parameters. The paired comparison of standard deviation values showed the strongest difference, achieving a level of only P=0.12.

Jitter

Values for percent of cry jitter ranged from 2.2% to 25.3% (X=9.0%, SD=4.0%), and were probably related to the level of turbulence or aperiodicity during phonation. First and second phonations differed significantly in percent of jitter, although this relationship was not strong (P<0.10).

Formant Frequencies

Means and standard deviations (respectively, in Hz) obtained for cry formant frequencies were: F1=929, 126; F2=1347, 214; F3=2235, 290; F4=3631, 452; F5=6043, 853. First phonation values were found to be significantly higher than second phonation values only for F1 (P<0.01), indicating that the vocal tract configuration was approximately stable over time.

Long Term Spectral Distribution

Although individual spectral curves tended to vary substantially, a plot of averaged spectral data identified six energy peaks, centered approximately at 460, 940, 1190, 2140, 3400 and 6080 Hz. In general, the cries had a spectrum with most energy concentrated between 830 and 2700 Hz. Energy level tended to be most variable and to drop off rapidly at higher frequencies (approximately 3600 Hz and above).

The spectral curves of first and second phonation cries were nearly congruent in the central frequency region (between F1 and F3). However, P1 exhibited greater spectral energy levels in the lower and upper frequencies. The differences in low frequency energy probably were reflective of increased aperiodicity and lower signal-to-noise ratios for P1, while high frequency differences were likely due to changes in mouth opening.

Peak Intensity Level

Peak absolute intensity levels for each cry ranged from 77.5 to 99.0 dB (X=88.8, SD=5.5) at a distance of 24 inches from the microphone. No difference was found in this parameter due to phonation sequence.

Timing

Latency. After the application of the stimulus, about half of the infants responded with an average of 2.4 (SD=1.8) non-phonational vocalizations (a "cough" or inspiratory burst); the other half responded immediately with the first phonation. Independently of the manner of response, the latency time (X=1.6 seconds, SD=0.87 seconds) to onset was approximately the same. If the infant did respond first with a non-phonational vocalization, then approximately the same amount of time (X=1.83 seconds, SD=1.91) elapsed from that first vocalization to the first phonation, although this second segment of time tended to be more variable.

Duration. The first phonation within a cry episode averaged 4.3 seconds (SD=2.9), while the second phonation averaged 1.5 seconds (SD=0.85). The time lapse between the first and second phonations averaged 0.9 seconds (SD=1.1). Approximately one third of the infants produced inspiratory bursts during this interval.

Perceptual Judgements

Author's judgements of cry features. More than half the cries (34) were judged to have some level of roughness. Of that number, five were judged as extremely rough. Nearly half of the cries (41%) exhibited vocal fry during some portion of the cry. Approximately one fifth of the cries were rated as extremely variable in frequency. Finally, six of the 61 cries (9.8%) exhibited temporary shifts in register during phonation. Five of those cries were upward shifts (from modal to loft), while one was a downward shift (loft to modal).

Each of these four features was more pronounced for the first phonation than for the second: roughness was noted slightly more often, fry occurred more than twice as often, frequency variation more than three times as often, and register shifts twice as often. Both second phonation register shifts followed high frequency first phonations.
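The comparisons in the preceding paragraph can be checked against the counts given in the author's judgements earlier in this chapter. The sketch below simply tabulates those first-phonation (P1) and second-phonation (P2) counts and forms their ratios; the dictionary layout and names are illustrative assumptions, and "roughness" here pools the moderately and extremely rough categories.

```python
# (P1 count, P2 count) for each feature, taken from the author's judgements.
counts = {
    "roughness (moderate or extreme)": (3 + 16, 2 + 13),
    "vocal fry": (17, 8),
    "extreme frequency variation": (10, 3),
    "register shifts": (4, 2),
}

# Ratio of first-phonation to second-phonation occurrences per feature.
ratios = {name: p1 / p2 for name, (p1, p2) in counts.items()}

for name, ratio in ratios.items():
    p1, p2 = counts[name]
    print(f"{name}: P1={p1}, P2={p2}, P1/P2={ratio:.2f}")
```

The resulting ratios are consistent with the wording above: slightly more than one for roughness, just over two for fry, just over three for frequency variation, and exactly two for register shifts.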

Panel's judgements of roughness and strain. Five highly trained voice professionals rated first cries as consistently rougher and more strained than second cries within an episode.




ACOUSTIC/TEMPORAL CHARACTERISTICS OF THE PAIN CRIES OF NORMAL NEONATES

By

BRIAN ROSS KLEPPER

A DISSERTATION PRESENTED TO THE GRADUATE SCHOOL OF THE UNIVERSITY OF FLORIDA IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF DOCTOR OF PHILOSOPHY

UNIVERSITY OF FLORIDA

1985

ACKNOWLEDGEMENTS

It has been my extreme good fortune to undergo the training for this degree in the company of talented professionals, good friends and family. This project, and indeed my entire academic career, has been patiently endured and urged on by many people.

My wife, Merlynn, has rowed with me through it all, has remembered the purposes, and believed in the worth of the crossing.

My parents have remained steadfastly supportive and ready to help at every moment. From them I have learned patience, compassion and the value of family.

Bob never failed to provide insight and analysis, seeing into the heart of structure and context. His friendship, great integrity and compassion have framed every situation of my adult life.

Dr. Susan Armstrong generously served as Brazelton evaluator and provided a developmental perspective that was invaluable. Her constant support and devotion to the ideas embodied by this study truly allowed it to come to fruition.

Dr. Gene Cranston Anderson's orientation to the dynamics of crying facilitated a leap in my own understanding of the implications of stress and infant vulnerability.

Drs. Ken Gerhardt, Howard Rothman, Sam Brown, Keith Berg and Richard Silverman have always been available for good counsel and support. Through their clear approaches to science and scholarship, each enriched my academic experience.

The people of IASCP--Kathy Farley, Norman Green, Stephanie Baldwin, Dr. Jim Hicks, Dr. Don Teas, Dr. Paul Moore and Dr. Kathy Berg--have been helpful and cooperative friends for my entire stay here.

Finally, Dr. Harry Hollien first precipitated my coming to graduate school and saw me through to final closure.

To each of these people, I give my heartfelt thanks.

TABLE OF CONTENTS

ACKNOWLEDGEMENTS
LIST OF TABLES
LIST OF FIGURES
ABSTRACT

CHAPTERS

I INTRODUCTION
    Perspectives
    The Physiological Dynamics of Crying
    Evaluation of the Cry Signal
    Perceptual Analyses of Cry
        Information conveyed by the cry
        Perceptions of health state from the cry
    Acoustic/Temporal Studies of Cry
        Fundamental frequency (F0)
        Jitter
        Formant frequencies
        Spectral distribution of energy level
        Intensity analyses of cry
        Temporal analyses of cry
            Latency
            Duration
    Summary of Previous Studies
    Specific Aims

II METHOD
    Overview
    Subject Selection
    Procedure
    Determination of Normality
        The parent interview
        Appropriate fetal growth
        Developmental status
    Collection of Cry Samples
        Elicitation of the pain cry
        Recording procedures
    Analysis Procedures
        Acoustic and temporal analyses
            Crying fundamental frequency
            Vocal jitter
            Formant frequencies
            Long term spectral analysis
            Intensity
            Timing
        Perceptual analyses
    Statistical Analysis Procedures

III RESULTS
    Acoustical Analyses
        Fundamental Frequency
        Jitter
        Formant Frequencies
        Long Term Spectral Distribution
        Intensity
    Temporal Analyses
        Latency
        Duration
    The Perceptual Analyses
        Author's judgements of "cry quality" characteristics
            Roughness
            Vocal fry
            Variations in frequency
            Register shifts
        Perceptions of roughness and strain
    Summary
        Fundamental Frequency
        Jitter
        Formant Frequencies
        Long Term Spectral Distribution
        Peak Intensity Level
        Timing
            Latency
            Duration
        Perceptual Judgements
            Author's judgements of cry features
            Panel's judgements of roughness and strain

IV DISCUSSION
    Basic Hypotheses
        Fundamental Frequency Production
        Jitter
        Formant Frequencies
        Long Term Spectra
        Intensity
        Latency
        Duration
    Other Issues
        Phonation
        Formant frequencies
        Peak intensity level and duration
    The Physiology of the Pain Cry
        Frequency
        Noise
    A Model of Cry Production
    The Use of the Pain Stimulus--Implications
    Cry and Pathology--A Perspective

V CONCLUSIONS

APPENDICES

A COMPARISONS OF THE CRIES OF NORMAL AND PATHOLOGICALLY INVOLVED INFANTS
B THE PHYSIOLOGY OF CRY AND ACOUSTIC/TEMPORAL CORRELATES
C LETTER TO LOCAL PEDIATRICIANS
D LETTER TO PARENTS OF PROSPECTIVE SUBJECTS
E THE PONDERAL INDEX
F THE BRAZELTON NEONATAL BEHAVIORAL ASSESSMENT SCALE
    Curriculum Vitae--Susan Armstrong, Ph.D.
G INFORMED CONSENT FORM
H TAPE RECORDER CALIBRATION FOR INTENSITY MEASUREMENTS
I SOMATIC DATA FOR INDIVIDUAL INFANTS
J BNBAS (DEVELOPMENTAL) DATA FOR INDIVIDUAL INFANTS
K LONG TERM POWER SPECTRAL VALUES FROM INDIVIDUAL CRIES
L PERCEPTUAL JUDGEMENTS OF CRY

REFERENCES
BIOGRAPHICAL SKETCH

LIST OF TABLES

TABLE
1-1 Studies of fundamental frequency of normal neonatal cry
1-2 Studies of formant frequencies of normal neonatal cry
1-3 Studies of intensity of normal neonatal cry
1-4 Studies of latency of normal neonatal cry
1-5 Studies of duration of normal neonatal cry
3-1 Fundamental frequency values for first phonation (P1) and second phonation (P2) cries
3-2 Corroboration of FFI fundamental frequency (F0) values
3-3 Summary table of frequency parameters
3-4 Results of paired difference t-tests between selected frequency parameters of phonation 1 and phonation 2
3-5 Formant frequency values for each cry sample
3-6 Results of paired difference t-tests between formant frequency values of phonation 1 and phonation 2
3-7 Averaged long term spectral data for all cry samples
3-8 Peak absolute intensity values (in dB SPL) for each cry phonation
3-9 Timing data for individual subjects
3-10 Perceptual responses of five trained listeners to each cry
4-1 Comparison of fundamental frequency values for normal pain cry in previous studies and in this study
4-2 Comparison of mean formant frequency values obtained in this study with those from previous studies
I-1 Somatic data for individual subjects
I-2 Brazelton scores for individual subjects
J-1 Long term spectral values for each of the cry samples
J-2 The author's perceptual judgements of the cries relative to four parameters of "vocal quality"
J-3 Individual and mean perceptual responses to "strain" on each of the cry samples
J-4 Individual and mean perceptual responses to "roughness" on each of the cry samples

LIST OF FIGURES

Figure
1-1 Schematic of the physiological systems underlying human sound production
2-1 Schematic of the neonatal pain cry
3-1 Scatterplot of mean fundamental frequency values (in kHz) for each cry sample
3-2 Averaged long term spectral curve for all phonations
3-3 Comparison of the averaged long term spectral curves for the first and second phonation
3-4 Averaged long term spectral curve for first phonation
3-5 Averaged long term spectral curve for second phonation

Abstract of Dissertation Presented to the Graduate School of the University of Florida in Partial Fulfillment of the Requirements for the Degree of Doctor of Philosophy

ACOUSTIC/TEMPORAL CHARACTERISTICS OF THE PAIN CRIES OF NORMAL NEONATES

By

Brian Ross Klepper

May 1985

Chairman: Harry Hollien, Ph.D.
Major Department: Speech

The cry of the newborn infant is a complex signal which, through its components, reflects a host of underlying physiological processes. In order for the cry to be meaningfully studied, it first is necessary to describe it operationally. Earlier studies have yielded data for a broad range of parameters. However, few studies have been carried out which have applied even a high proportion of these analyses to a single data base. Consequently, acoustic and temporal (A/T) values were obtained for the cries of normal neonates in order to enhance present understanding and use of the cry.

Pain cries were recorded from 31 infants who were 26 to 30 days of age, had been full term, had uneventful medical histories, and who were shown to be within normal limits developmentally (using the Brazelton Neonatal Behavioral Assessment Scale) and somatically (using the Ponderal Index). Six A/T analyses were applied to determine values for fundamental frequency, jitter, formant frequency, long term spectra, intensity, latency, and duration. In addition, a panel of trained listeners rated each cry for levels of "strain" and "roughness."

While the preponderance of cries were low frequency (modal) in nature (around 400-530 Hz), a few normal cries were also high frequency (hyperphonational). High frequency phonation most often occurred immediately after stimulus application, and only for certain infants. This suggests that the generalized muscular contractions and heightened laryngeal tensions produced by the stimulus are reflected through cry frequency registration, and may provide an index of the infant's vulnerability to stress. Furthermore, the ability to recover to a "modal" cry may provide an index of the infant's ability to adapt to the stimulus and achieve homeostasis. In addition, latency values were found to be stable independent of the manner of cry response, indicating that this parameter may be a useful reflection of neurological constancy in normals. Finally, the extreme characteristics of first phonations relative to second (across most parameters) suggest that while early portions of the induced cry episode are stimulus-bound, reflecting characteristics of the response, later phonations evolve into a "basic cry mode," and probably are stimulus-independent.

CHAPTER I
INTRODUCTION

Perspectives

The cry of the neonate is an expression to the outside world of dissatisfaction with an external or internal situation. Crying is a holistic, all encompassing state, which not only communicates a response or need, but also reflects underlying physiology. The baby vocalizes, tenses the muscles throughout his body (Bosma et al., 1966) and undergoes a range of systemic physiological changes (Anderson, 1984). Thought to be principally mediated by the periaqueductal region of the midbrain (Ploog, 1981), and under partial control of the autonomic nervous system (Lenneberg, 1967; Lieberman et al., 1971; Lester and Zeskind, 1982), the newborn cry is characterized by rhythmic sets of motorically complex reflexive acts that respond to environmental stimuli encountered at birth (Osgood, 1953).

In addition, the cry is an acoustic and temporal (A/T) event which, independently of intention or meaning, reflects specific physiological processes. The central nervous system (CNS) provides central control, the respiratory system powers the cry, the larynx operates to produce a source tone and the supra-glottal tract modulates the resulting signal into a final acoustic event (Fant, 1970).

Over the last half century, investigators have examined the perceptual and A/T characteristics of infant cry; research on the
relationship between the cry characteristics of normal and pathologically involved babies has formed the nucleus of this thrust (See Appendix A). In addition, some work on the cry characteristics of different classes of normal infants has been carried out. Many significant contributions have been made by investigators in this area. However, as the discussion below will demonstrate, there have been problems at times with the fundamental assumptions and procedures utilized in these studies. Consequently, new protocols are needed so that the range of A/T cry characteristics may be described for normal neonates.

Through the years, researchers have obtained different types of vocalizations from infants over a wide range of ages. Therefore, it is appropriate to define the ages and vocalizations used in this discussion. As defined by the neonatological and the developmental psychological literatures, a neonate is an infant between birth and four weeks of age (Brazelton, 1981; Stratton, 1982). Infancy has been defined as a period extending between birth and 104 weeks (Crystal, 1973), but is applied here as a general term. Since the focus of this project was limited to the cry of the neonate, the terms "neonate," "newborn," "infant" or "baby" are used interchangeably.

In addition to crying, neonates exhibit a large number of vocal behaviors (whimpers, grunts, sneezes, sucking sounds, coughs, coos, etc.) which may properly be referred to as "vegetative" sounds (Hollien, 1980). While vegetative sounds may form a significant basis for later vocal behaviors (e.g., Stark et al., 1975; Nakazima, 1980) they are not as consistently patterned as are cries, nor as easily
elicited or controlled. Consequently, neonatal vegetative sounds will not be considered in this project.

The Physiological Dynamics of Crying

Crying is a synergistic event, controlled by sets of neural commands which coordinate myriad programmed responses (Ploog, 1981). Like most sounds produced by humans, the vocal dimension of the neonate's cry results from the contributions of four physiological systems (See Figure 1-1). The nervous system provides central control for each of the other systems, synchronizing and coordinating their actions into a stylized form. The respiratory system is the power source for the sound production mechanism, creating pressure in the lungs which drives an airstream through the glottis. The larynx provides a mechanism for variable resistance to the air stream, resulting in the vibratory action of the vocal folds, and in turn, a source tone. Finally, the supra-glottal tract, comprised of several resonant chambers, acts to modify and "shape" the acoustic wave. A more detailed discussion of these four systems and assumptions as to their correlates in the cry may be found in Appendix B.

Figure 1-1. Schematic of the physiological systems underlying human sound production. (Blocks shown: central nervous system, central control; respiratory system, power source; laryngeal system, source tone production; supra-glottal tract, source tone modulation.)

Although the systems cited above are essential to a vocal cry response, each has a fundamental biological function in the maintenance of the organism. Consequently, the physiological responses associated with survival may interact with and significantly affect the production of the vocal signal. Obviously, the nervous system does not exist simply to control the cry, but facilitates information processing and transfer for all activities of the individual. The respiratory system promotes oxygenation of and carbon dioxide removal from the blood. The larynx is essentially a respiratory valve which prevents air from escaping the lungs and foreign substances from entering the airway. Finally, the supra-glottal tract establishes communication between the digestive and respiratory tracts and the exterior (Zemlin, 1981).

In addition, there are a host of other systemic changes which accompany crying: bio-chemical (e.g., Woodson et al., 1981), cardiovascular (e.g., Vaughn and Sroufe, 1979; Anderson, 1978, 1984), respiratory (Karlberg, 1960; Long and Hull, 1961; Klaus et al., 1963; Vyas et al., 1981), state-transitional (Ruja, 1948; Thoman, 1975), and motoric (Bosma et al., 1966; Bosma, 1975). Thus, a spectrum of specific physiological activities is associated with crying, and the characteristics of each of these activities may interact with and affect the dynamics of cry vocalizations.

Anderson (1984) has argued that continuous or severe crying exerts potentially dangerous loads on the infant's physiological systems. For example, the generalized muscular tension which occurs during crying (Bosma et al., 1966) manifests locally in constriction of the laryngeal musculature (possibly for protection of the airway) and often results in periods of partial or complete glottal closure. The increase in airway resistance is complicated by increased contraction in the muscles of expiration. This juxtaposition of events constitutes a modified Valsalva maneuver. A radical increase in abdominal pressure occurs, raising the potential for abdominal hernia. Also, release from this strained posture may produce severe changes in blood pressure, which may result in recirculation of poorly
oxygenated blood as well as fluctuations in cerebral blood flow velocity, all circumstances conducive to intraventricular hemorrhage.

It is reasonable to speculate as well that the dynamics associated with stress and the Valsalva maneuver (especially increased laryngeal muscle activity) result in specific vocal changes, including increases in frequency (from increased tension in the vocal folds), intensity level (from increased air flow during exhalation) and latency to the first phonation (from the interrupting influence of glottal closure).

Several authors (Truby and Lind, 1966; Wolff, 1969; Lester and Zeskind, 1982) have noted that changes in muscle tension appear to correlate with changes in cry features through the course of the cry episode, and have speculated as to the different patterns of neurological innervation which underlie the motoric involvement of the different cry types. In their model of crying, Lester and Zeskind (1982) have hypothesized that a stressful stimulus initiates generalized and immediate inputs to the body by the sympathetic division (SNS) of the autonomic nervous system (ANS). In the larynx, this generalized input results in constriction of the intrinsic (and possibly the extrinsic) laryngeal musculature, with consequent lengthening and tensing of the vocal folds. Subsequently, the parasympathetic wing (PNS) of the ANS moderates the laryngeal tightening by providing discrete, inhibitory vagal input to the cricothyroid muscle, which is the principal muscle of laryngeal tension and frequency change. This model describes a servomechanism in which the immediate biological response is directly accountable to the initiating stimulus, but which over time achieves homeostasis and consequently is not tied to the stimulus. The vocal aspect of that
7 response, the cry, follows the same pattern. That is, the onset of crying is likely to reflect characteristics of the stimulus (e.g., due to pain or hunger) while the basic or "modal" cry will be representative of the eventual return to "balance" within the system. Evaluation of the Cry Signal As described above, much research has been carried out on the physiological correlates of crying. However, many investigations have utilized the acoustic cry signal, and have focused on characteristics of frequency (pitch), wave composition (quality), intensity (loudness) or timing. Studies of cry have used two principal approaches: 1) perceptual analyses and 2) A/T analyses. The first approach (perceptual analyses) involves listening to live or recorded cries in order to make judgements about them. The second (A/T analyses) utilizes various instrumentational analyses to examine the frequency, intensity or timing characteristics of the recorded cry event. Perceptual Analyses of Cry Historically, many important descriptions of infant cry have been obtained from simple and careful observation. In an early experiment in bioacoustics, William Gardiner (1838) provided musical descriptions of various sounds in nature, including the cries of a child. Darwin (1872) discussed crying in his volume on the emotions of animals and man, using line drawings and photographs to describe the facial expressions and movements that accompany the cry. The invention of sound recording enabled listeners to repeatedly examine transient events. In 1906, Flatau and Gutzmann used this


process to record and study the cries of 30 infants. They estimated the pitch of the expiratory cry tones to be from G4 (392 Hz) to E7 (2637 Hz), a range of almost 3 octaves--and a reasonable approximation of the range of fundamental frequency in infants as described in later analyses (see Table 1-1).

Information conveyed by the cry. A good deal of perceptual cry research has centered on the kinds of information conveyed by the newborn's cry. For example, a number of investigators have used listeners' judgements to examine the premise that the "quality" of a newborn cry is influenced by the cry stimulus (e.g., pain, hunger, startle). While Sherman (1927), Aldrich et al. (1945) and Muller et al. (1974) report little evidence to support this assumption, Wasz-Hockert et al. (1964), Wiesenfeld et al. (1981) and Sagi (1981) have provided data suggesting that individuals who have experience with infants (e.g., nurses, mothers) can distinguish between cries resulting from different stimuli better than less experienced subjects (fathers, non-mothers). In this regard, both Wolff (1969) and Bell and Ainsworth (1972) have argued that cries of differing acoustical qualities result in different responses from parents. However, as Hollien (1980) notes, the cry may simply be an alerting signal, and the caretaker may draw additional information about the stimulus from clues in the environment (time of day, infant history, etc.).

A number of experiments have been carried out to determine whether an individual infant's cries are so distinctive that the parent can differentiate that baby's cries from those of other infants. The data here appear to be consistent--mothers (and fathers) are generally


able to identify their own baby as early as three days after birth from the cry alone (Formby, 1967; Murry et al., 1975; Morsbach and Bunting, 1979; Wiesenfeld et al., 1981).

Perceptions of health state from the cry. Physicians traditionally have assumed that the cries of healthy babies can be perceptually distinguished from those of unhealthy infants. Moreover, pediatricians have long been trained to listen to the quality of the cry as a clue to diagnosis (Parmalee, 1962). On this subject, Illingworth (1955) wrote,

    a clinician recognizes the hoarse, gruff cry of a cretin, the hoarse cry of laryngitis, the shrill cry of hydrocephalus, meningitis, or cerebral irritability, the grunting cry of pneumonia, the feeble cry of amyotonia or of a severely debilitated infant. (p. 76)

Currently, several neonatal assessment examinations (e.g., Brazelton, 1976; Dubowitz and Dubowitz, 1981) use the evaluator's judgements of the latency and quality of the newborn's cry to determine the relative development and stability of the infant (Vaughn and McKay, 1975). Several authors have pointed out the danger of lending such a subjective measure too much weight in the diagnostic process (e.g., Truby and Lind, 1965; Ostwald, 1972). In a short essay on crying and diagnostics, Parmalee (1962) warned, "Despite the commonness of these observations in medical practice, there have been very few systematic studies of crying" (p. 801). Vuorenkoski et al. (1971) tested physicians and medical students with recorded cries from normal and pathological infants, and reported that both groups did quite well in differentiating normal cries from abnormal. Further, more information about the acoustic (spectrographic) correlates of particular conditions facilitated a


rise in correct identification rates. But while the authors pointed to the success of their training sessions, they also cautioned that the "relatively good results obtained in this study do not correlate in reality with the practical possibilities in clinical conditions." They noted that the test cries were culled as the best examples of specific conditions from several recordings of the same babies. In addition, subjects were allowed to listen to recordings repeatedly in order to make their judgements, while competing noises in a clinical situation would make the process considerably more difficult.

The ability of listeners to discriminate between cries is not limited to normal/abnormal comparisons. It may extend as well to differences among infants who have been classified as normal. Zeskind and Lester (1978) utilized a sample of normal infants, but reclassified them as "low complication" or "high complication," based on scores of 1-2 or 5-9 complications respectively on Prechtl's (1977) scale of 42 optimal maternal, parturitional and fetal conditions. Low scores resulting from the absence of optimal conditions have been used as indicators of the degree of risk to the infant's nervous system (Prechtl, 1968). The experimenters presented the cries in a perceptual test to parents and non-parents. Subjects asked to rate the cries on an eight point polarized scale found the high-complication cries to be significantly more "negative" on all eight points than the low-complication cries. Thus, this study demonstrated perceptible differences between "high- and low-risk normal" infants, and examined the relationship between the cry and the emotional response elicited from listeners.


Several other workers also have explored the emotional reactions of listeners to cries of healthy or abnormal babies. Frodi et al. (1978) reported that parents found the cries of premature infants more aversive than those of term newborns. Interestingly, Freudenberg et al. (1978) found that normal newborns produced a cry which was more grating to adult listeners than the cries of Down syndrome infants. On this topic, Lester (1978) has speculated that abnormal-sounding cries might profoundly affect the caregiver. He wrote:

    the cry of the infant at risk and the damaged or diseased infant may be unique in order to elicit special caregiving that may facilitate survival. [Conversely,] in a non-supportive environment, the behavior repertoire of the poorly organized infant with a cry that is perceived as grating and aversive may violate the limits of caregiver control behavior and suppress the optimal caregiving pattern necessary to facilitate the recovery of the baby. (p. 128)

Thus, characteristics of the cry carry information about underlying physiological processes, and the perceptual effects of those characteristics may have significant consequences in the relationship between the infant and his caregiver.

To summarize, the results of perceptual cry studies have demonstrated that listeners can perceive differences both between the cries of normal and abnormal infants, and between the cries of different classes of normal newborns. These data 1) lend credence to the postulate that different physiological states may be reflected in characteristic cry patterns, and 2) suggest that infants can be misclassified as "normal" and might not obtain that rating if more stringent classification schemes were applied. Indeed, as Lester (1978) notes, use of traditional classification schemes increases the probability that cries of infants from certain "normal classes" (or from one end of the "range of normality") might be perceptually


confused with infants known to be abnormal. In addition, some evidence has been found to support the notion that different cry types elicit different caregiving behaviors. Such findings are not surprising, and suggest that humans have evolved an inherent notion of the nature of "appropriate" and "inappropriate" cry signals.

Finally, it should be noted that, almost by definition, the resolution available in perceptual cry studies is severely limited. Listeners can identify differences between groups of cries, but it is difficult to ascertain which qualities of the cry provide the cues for differentiation. Thus, unless a viable taxonomy is developed to specify cry configurations for a full range of normal and pathological neonatal conditions, perceptual analysis of the cry may only serve as a gross indicator of status, and one that is easily open to misinterpretation.

Acoustic/Temporal Studies of Cry

Instead of relying on a perceptual "Gestalt" of the cry, many investigators have employed A/T analysis techniques to characterize the cry and its components. Workers in this area have generally utilized a common conceptual framework to define the source phenomena to which the components of the cry may be related. Even so, a broad range of procedures and parameters has been employed in feature extraction and a substantial corpus of data obtained. A few investigators have devised complex multi-parametric systems of cry analysis. In an early scheme, the Wasz-Hockert group utilized a host of cry parameters which were linked to spectrographic patterns. Lester and Zeskind (1982) refined this spectrographic analysis scheme,


eliminating some parameters and dividing the rest into features of duration, harmonics and quality. However, the difficulties involved in obtaining accurate acoustic/temporal cry data have still proved to be formidable, and until recently most investigators concentrated on only one or two parameters. Even the most ambitious (e.g., Colton and Steinschneider, 1980) have examined fewer than ten.

The application of A/T analysis procedures to acoustic signals assumes that any signal may be defined (i.e., quantified) in terms of its frequency, intensity and temporal characteristics. Moreover, because each aspect of the acoustic wave represents the contribution(s) of physical phenomena, the signal components may be assumed to be reflective of specific source (i.e., physiological) events. However, it also has been shown that the human sound production mechanism is composed of interacting systems. Consequently, the components of the cry signal also reflect the interactive dynamics of the sound production process (e.g., Golub, 1980).

Over the last 40 years, substantial effort has been focused on the measurement of certain acoustic and temporal patterns of the neonatal cry. While much of this research centered on establishing differences between the cries of normal and abnormal infants, some studies also have been carried out to define the acoustic/temporal characteristics of normal infants. Some of the major findings of those studies are reviewed below. Only studies of infants four weeks of age or less are included.
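
The premise stated above, that a signal can be quantified in terms of its frequency, intensity and temporal characteristics, can be illustrated with a minimal sketch. The example is not drawn from any of the studies reviewed here. It assumes a synthetic 500 Hz tone sampled at 10,000 samples/second as a stand-in for a cry segment, and the function names, the zero-crossing frequency estimate and the full-scale dB reference are all illustrative assumptions.

```python
import math


def synth_tone(f0_hz, length_s, sample_rate_hz, amplitude=1.0, phase=0.3):
    """Return samples of a pure tone (a hypothetical stand-in for a cry segment)."""
    n_samples = int(round(length_s * sample_rate_hz))
    return [
        amplitude * math.sin(2.0 * math.pi * f0_hz * i / sample_rate_hz + phase)
        for i in range(n_samples)
    ]


def duration_seconds(samples, sample_rate_hz):
    """Temporal measure: length of the segment in seconds."""
    return len(samples) / sample_rate_hz


def rms_level_db(samples, reference=1.0):
    """Intensity measure: RMS level in dB relative to an assumed full-scale reference."""
    rms = math.sqrt(sum(x * x for x in samples) / len(samples))
    return 20.0 * math.log10(rms / reference)


def f0_by_zero_crossings(samples, sample_rate_hz):
    """Frequency measure: count positive-going zero crossings per second.

    This is only adequate for a clean, periodic signal.
    """
    crossings = sum(1 for a, b in zip(samples, samples[1:]) if a <= 0.0 < b)
    return crossings / duration_seconds(samples, sample_rate_hz)


SAMPLE_RATE_HZ = 10_000  # assumed value for the illustration
tone = synth_tone(500.0, 0.5, SAMPLE_RATE_HZ)

print(f"duration: {duration_seconds(tone, SAMPLE_RATE_HZ):.2f} s")
print(f"level:    {rms_level_db(tone):.2f} dB re full scale")
print(f"F0:       {f0_by_zero_crossings(tone, SAMPLE_RATE_HZ):.0f} Hz")
```

A real cry is neither periodic nor noise-free for long stretches, so a zero-crossing count of this kind would be expected to fail on much of the signal. The sketch only shows what the three classes of measurement look like in their simplest form.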


Fundamental frequency (F0). The A/T cry parameter studied most thoroughly to date is fundamental frequency. Data for over 600 infants have been reported, and a mean F0 value of slightly below 500 Hz may be distilled from the often disparate results of those studies. Table 1-1 presents the summarized data from investigations of F0 in normal neonates.

A number of different tools have been used to extract fundamental frequency from the cry signal. Early studies of the F0 of cry (Fairbanks, 1942; Michel, 1961) utilized phonophotography; each laryngeal cycle was measured using a photograph of an oscilloscopic trace of the cry signal; the results were quite accurate, although the data reduction procedure was lengthy by contemporary standards. An easier tool to work with and, to date, the principal analysis technique employed to obtain F0 has been the time-frequency-amplitude (t-f-a) sound spectrograph. Although this device provides an adequate tool for estimating the timing of cries, it does not have adequate resolution to facilitate more than gross measurements of F0. In the narrow band mode (filter bandwidth--45 Hz), measurements of the fundamental must be made by estimating the frequency of the nth harmonic, and dividing by n. In the wide band mode (filter bandwidth--300 Hz), low frequency glottal pulses may be resolved enough for identification and counting, but pulses occurring with high frequency phonation tend to blur together, impeding accurate calculation. Thus, using spectrographic analysis it is difficult to determine the fundamental frequency characteristics of a cry vocalization for more than a small portion of the signal at a time, since the representation of phonation is usually interspersed


Table 1-1 -- Studies of fundamental frequency of normal neonatal cry.

---------------------------------------------------------------------
Study                              N   Class.   Age (days)
---------------------------------------------------------------------
 1. Flatau & Gutzmann, 1906       30            0-35
 2. Fairbanks, 1942                1   m        0-30
 3. Michel, 1961                   1   m        10
                                   1   f        13
 4. Ringel & Kluppel, 1964        10   4m/6f    1-2
 5. Sheppard & Lane, 1968          1   m        1-9
                                                13-21
                                                25-33
                                   1   f        1-9
                                                13-21
                                                25-33
 6. Wasz-Hockert et al., 1968     77            0-39
                                  60            0-30
                                  75            0-30
 7. Lieberman et al., 1971         1   f        0
 8. Michelsson, 1971              50            0-10
 9. Tenold et al., 1974            9            2
10. Ostwald, 1974                  5
11. Prescott, 1975                 4            1-10
12. Lester, 1976                  12   flwt     2
                                  12   undwt    2
13. Sirvio & Michelsson, 1976     50
14. Kittel & Hecht, 1977          50            0
---------------------------------------------------------------------
F0 and SD (column entries, in order): 373 289 352 413 30.05 438 411 404 * 401 384 401 500 530 470 400 620 110 518 384 308 479 425 38 32 26
Range (column entries, in order): 392-2637 153-888 290-508 * 450-550 410-650 390-550 400-600 390-620
Stimulus (column entries, in order): spont. hunger hunger hunger pain birth pain hunger birth pain pain spont. pain pain birth
---------------------------------------------------------------------


Table 1-1 -- continued.

-------------------------------------------------------------------------------------
Study                              N  Class.    Age       F0   SD  Range     Stimulus
                                                (days)
-------------------------------------------------------------------------------------
15. Lester, 1978                  12  flwt      2        466   83            pain
                                  12  undwt     2        740  176            pain
16. Zeskind & Lester, 1978        12  lcomp     2        468   54            pain
                                  12  hcomp     2        814  263            pain
17. Lester & Zeskind, 1978        40            2        606  302            pain
18. Colton & Steinschneider, 1980 33  m         7        503   92            handling
                                  33  f         7        523  101            handling
19. Gardosik et al., 1980         53  m         0        467   79  327-669   birth
                                  50  f         0        458   65  356-664   birth
20. Zeskind, 1981                 13  ave-PI**  2        524   29  440-800   pain
                                  13  low-PI    2       1140  603  410-2000  pain
                                  13  hi-PI     2        991  576  410-2000  pain
21. Zeskind & Lester, 1981        19  ave-PI    2        488  100            pain
                                  19  low-PI    2        665  227            pain
                                  19  hi-PI     2        665  277            pain
-------------------------------------------------------------------------------------

Stimulus       N   X(F0)-Hz
Pain         236   536
Birth        231   467
Hunger        78   465
Spont.        70   506
All Cries    615   498

*Not presented
**Ponderal Index (see Appendix E)
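
The "All Cries" row at the foot of Table 1-1 is consistent with a mean weighted by the number of infants in each stimulus class. The short sketch below checks that arithmetic using the N and mean F0 values from the stimulus summary; the variable and function names are illustrative, and the weighting scheme is an assumption about how the pooled value was obtained.

```python
# Stimulus summary from Table 1-1: stimulus -> (N, mean F0 in Hz).
stimulus_summary = {
    "Pain": (236, 536),
    "Birth": (231, 467),
    "Hunger": (78, 465),
    "Spont.": (70, 506),
}


def pooled_mean(groups):
    """Return (total N, N-weighted mean) for a mapping of label -> (N, mean)."""
    total_n = sum(n for n, _ in groups.values())
    weighted_sum = sum(n * mean for n, mean in groups.values())
    return total_n, weighted_sum / total_n


total_n, mean_f0 = pooled_mean(stimulus_summary)
print(total_n, round(mean_f0))  # 615 498
```

The result matches the tabled values of 615 cries and 498 Hz, which also agrees with the statement that the mean F0 across studies is slightly below 500 Hz.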


between periods of noise or quiet. Calculations of the mean, standard deviation, mode or other parameters are extremely difficult to make. Indeed, Michelsson (1980) notes that the temporal instability of F0 in the crying infant resulted in Wasz-Hockert et al.'s concentration on the specification of "maximum pitch," "minimum pitch" and "general pitch" (i.e., the "dominating pitch level in the cry"), rather than a mean value.

Perhaps the most accurate determinations to date of neonatal F0 have employed digital spectrum analyzers (Lester, 1976; Zeskind and Lester, 1978; Lester and Zeskind, 1978; Gardosik et al., 1980), which sample the cry signal at rates from 10,000-20,000 samples/second. Frequency by amplitude histograms are developed for each bandwidth in the analyzed portion of the spectrum and the first peak on the output histogram is "assumed to be the F0 for that sample" (Gardosik et al., 1980). However, the accuracy of this method is compromised to some degree as well since 1) the resolution of the results is limited by the size of the bandwidths, and 2) histogram peaks do not represent the mean value of the sample, but rather the mode (i.e., the most frequently occurring value). While most analysis procedures have provided only an approximation of F0, precise calculation of cry F0 has been obtained using the IASCP Fundamental Frequency Indicator (FFI) (Murry et al., 1975), although those data are not included in this report since the infants were 3-6 months of age. FFI is discussed in more detail in Chapter II.

Another difficulty with the F0 cry values reported to date stems from the large variability in data among studies. For example, different investigators have elicited cries from different stimuli. (The breakdown of mean F0 by stimulus is provided in Table 1-1.)
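
When F0 values from different studies are compared, it is convenient to express them in semitones (ST) as well as in Hz, so that differences are stated as intervals rather than raw frequency gaps. The sketch below assumes a reference of about 16.35 Hz (the note C0). That reference is an inference, chosen because it reproduces the semitone figures quoted in this chapter for the studies in Table 1-1, and the function names are illustrative.

```python
import math

# Assumed reference frequency: C0 (about 16.35 Hz). This is an inference that
# reproduces the semitone values quoted in the text, not a documented choice.
C0_HZ = 16.3516


def hz_to_semitones(f_hz, ref_hz=C0_HZ):
    """Express a frequency as semitones above the reference frequency."""
    return 12.0 * math.log2(f_hz / ref_hz)


def percent_difference(value_hz, base_hz):
    """Signed difference of value_hz from base_hz, as a percentage of base_hz."""
    return 100.0 * (value_hz - base_hz) / base_hz


# Mean F0 values (Hz) as reported for the studies named in the text.
reported_f0_hz = {
    "Michel (1961)": 289.0,
    "Sheppard and Lane (1968)": 438.0,
    "Wasz-Hockert et al. (1968)": 530.0,
    "Michelsson (1971)": 620.0,
    "Gardosik et al. (1980)": 462.5,
}

for study, f0 in reported_f0_hz.items():
    print(f"{study}: {f0:.1f} Hz = {hz_to_semitones(f0):.2f} ST")

print(f"{percent_difference(438.0, 289.0):.1f}% (Sheppard and Lane relative to Michel)")
print(f"{percent_difference(530.0, 620.0):.1f}% (Wasz-Hockert et al. relative to Michelsson)")
print(f"{percent_difference(462.5, 620.0):.1f}% (Gardosik et al. relative to Michelsson)")
```

Note that a percentage difference depends on which value is taken as the base, so the base study should be stated whenever such figures are quoted.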


However, while some differences in F0 are apparent (e.g., F0(pain) > F0(hunger)), the results overall are clouded by ambiguities as to the nature of the "spontaneous" and "birth" stimuli. In addition, many investigators have used very small subject samples in their studies (e.g., Fairbanks, 1942; Michel, 1961; Sheppard and Lane, 1968; Ostwald, 1974). Not surprisingly, the variability among F0 values obtained for these small-sample studies is quite large. Michel (1961) reports an F0 value of 289 Hz (49.7 semitones (ST)) for a single ten day old male child, while Sheppard and Lane (1968) report a value of 438 Hz (56.9 ST), 52% higher than Michel's infant, for another male child of nearly the same age.

Similar patterns of F0 variability are evident for larger samples of infants. For example, Michelsson (1971) reports data for 50 babies at 620 Hz (62.94 ST). By contrast, Wasz-Hockert et al. (1968) report that the mean F0 value for pain cries of 60 newborns is 530 Hz (60.22 ST) (about 15% lower than Michelsson's) and Gardosik et al. (1980) arrive at a mean value of 462.5 Hz (57.86 ST) for the birth cries of 103 infants (25.5% below Michelsson's data). Such wide variability in the F0 values of groups of normal infants seems excessive.

The variability of F0 within groups of normal infants has been addressed in studies carried out by Lester (1976), Lester and Zeskind (1978), Zeskind and Lester (1978, 1981), and Zeskind (1981). These researchers have compared groups of clinically "normal" two day old infants who subsequently were reclassified as "low risk" or "at risk" using the Ponderal Index (PI) and the Brazelton. They hypothesize that since these tests provide information about conditions which are


indicative of CNS "stress," poor scores on the examinations may be correlated with abnormal cry characteristics (such as F0). Interestingly, significant differences between cry features for the "low risk" and "at risk" groups have been found consistently. The fundamental frequency of healthy infants (with higher PI scores and Brazelton performances) has been generally lower, while underweight infants (with low PI scores and Brazelton performances) have exhibited higher F0s. The data from the Lester and Zeskind studies lead to the conclusion that studies of "normal infants" which have not used such stringent subject selection criteria may have included both "low risk" and "at risk" babies in their samples. Consequently, the investigators in these studies may have obtained mean F0 and standard deviation values that are skewed upward.

Finally, the classification of fundamental frequency production according to "cry modes" was introduced by the Wasz-Hockert group (cited in Michelsson et al., 1980), who noted that infants phonate in three distinct modes: "phonation," "hyperphonation" and "dysphonation." They hypothesized that while normal phonation is predominant, 1) tightening of the laryngeal musculature may result in short periods of high frequency production (hyperphonation) and 2) inadequate closure of the vocal folds may produce some turbulence in the signal (dysphonation) (see also Physiological Dynamics of the Cry, above). While the Wasz-Hockert group simply noted the absence or presence of this characteristic in their analysis, Golub (1980) furthered analysis of their parameter by using a "formant tracking" program and linear predictive correlation analysis. He reported that normal infants "normally" phonate during approximately 80% of the cry,


"hyperphonate" during approximately 5% of the cry and "dysphonate" during approximately 15% of the cry.

In actuality, these classification modes refer to two different kinds of phenomena: 1) vocal registers during crying and 2) the presence or absence of turbulence, noise or aperiodicity during "normal phonation" (since noise does not appear to occur during "hyperphonation"). In addition, cry registration has been identified not only with respect to the frequency of a continuous cry, but also in the rapid transition during one phonation from phonation to hyperphonation and back again. This phenomenon is known in the phonetics literature as a "voice break," but is referred to in the cry literature simply as a register shift (e.g., Michelsson, 1980; Lester and Zeskind, 1982).

Hollien (1974) defines a vocal register "simply as a series or range of consecutively phonated frequencies which can be produced with nearly identical vocal quality [and with] little or no overlap in fundamental frequency (F0) between adjacent registers" (p. 126). He points out that while the idea of vocal registration is used loosely within several disciplines, the identification of a vocal register requires an operational definition 1) perceptually, 2) acoustically, 3) physiologically and 4) aerodynamically. It seems clear that while there is some evidence available (perceptual, acoustic) to substantiate the existence of vocal registers during crying, the physiological and aerodynamic data are insufficient at this juncture. Still, the concept of registers provides a convenient vehicle for the discussion of similar qualities among cries, and consequently the terms appropriate to different registers (from Hollien, 1974) will be


applied throughout the following text. The type of cry which occurs most often will be referred to as "modal." High frequency crying, sometimes referred to in the literature as "flute" or "whistle" (Hollien, 1974), will be referred to as "hyperphonation," "falsetto," or "loft." Low frequency crying, called "glottal pulse" or "creak" in the literature, will be referred to as "pulse" or "fry."

Jitter. Vocal jitter is the cycle-to-cycle variation found within successive periods of a laryngeal vibratory pattern; it is a commonly accepted characteristic of human vocalization. Although jitter has not been employed in cry studies thus far, it has been mentioned in the cry literature, and is potentially attractive as a useful measure. Normally, it is assumed that the phonatory sample is approximately stable around a central frequency, since increases or decreases in frequency may tend to bias measurement. Moreover, the phonational samples generally are relatively free of aperiodicity, since the presence of noise, characterized by large (apparently random) frequency changes between adjacent cycles, may inflate jitter values.

Jitter has been found to occur in the sustained phonation of young adults. Values of 0.5-1.0% appear to be typical, although the magnitude appears to be dependent upon age, frequency and intensity level of the particular sample and the method of measurement (Horii, 1979; Wilcox, 1978; Hollien et al., 1973). It has been suggested also that the magnitude of jitter is indicative of the general condition of the larynx (Hecker and Kreul, 1971; Murry and Doherty, 1980). Lieberman (1963) proposed that jitter may have utility in describing the stability of infants' laryngeal control. Similarly,


Bosma, Truby and Lind (1965) suggested that an infant's neural developmental status may be evaluated from factors such as the stability of laryngeal coordinations and the mobility of vocal tract components during crying. To date, no studies of neonatal crying have examined jitter. However, it is not unreasonable to assume that jitter levels might be significantly larger for neonates, since frequency production often is quite variable, and since a noise component (characterized by random cycle-to-cycle changes in frequency) often constitutes a substantial portion of cry phonation.

Formant frequencies. The specification of formant frequencies during crying has been the least investigated of the cry parameters (see Table 1-2). Most investigators interested in this area have measured the center frequencies of cry formants on t-f-a spectrograms, but Gardosik et al. crosschecked their obtained values through the use of a spectrum analyzer, verifying that spectral energy peaks existed at the expected frequencies. Lieberman et al. (1971) reported that the formant values they obtained were similar to those predicted from the vocal tract transfer function. The major difference between the observed and predicted frequencies occurs at the first formant, where the center frequency is much higher than expected. In general, the findings of each of the four studies carried out in this area are in agreement with one another. However, while it is a relatively simple task to identify spectral peaks, it is difficult to know how those frequencies relate to the production of sound during cry since 1) there is little good information about the anatomical characteristics of the infant


Table 1-2 -- Studies of formant frequencies of normal neonatal cry.

--------------------------------------------------------------------------------
Study                               N  Class   F1            F2            F3
--------------------------------------------------------------------------------
1. Ringel & Kluppel, 1964          10  4m/6f   (1400-1500)@  (1700-3200)@
2. Lieberman et al., 1971           1  m       1100          3300          *
3. Colton & Steinschneider, 1980   33  m       1592.5        3223.8        5337.2
                                               (396.8)       (585.4)       (863.6)
                                   33  f       1653.1        3274.6        5368.2
                                               (322.5)       (487.0)       (785.8)
4. Gardosik et al., 1980           53  m       1573.2        3106.5
                                               (937-3281)    (1719-4375)
                                   50  f       1527.0        3111.2
                                               (900-2400)    (1875-4375)
--------------------------------------------------------------------------------
*Not provided
@approximately
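
Formant values such as those in Table 1-2 are conventionally compared with the resonances of a uniform tube closed at one end, which fall at odd multiples of c/4L. The following sketch computes the first three resonances for a given tract length and, conversely, the tract length implied by a measured first formant. The function names are illustrative, and the sound velocity of 34,000 cm/second (340 meters/second) is the approximate value assumed throughout.

```python
SPEED_OF_SOUND_CM_S = 34_000.0  # approximately 340 m/s, expressed in cm/s


def tube_formants(length_cm, n_formants=3, c=SPEED_OF_SOUND_CM_S):
    """Resonances (Hz) of a uniform tube closed at one end: (2k + 1) * c / (4 * L)."""
    return [(2 * k + 1) * c / (4.0 * length_cm) for k in range(n_formants)]


def tube_length_from_f1(f1_hz, c=SPEED_OF_SOUND_CM_S):
    """Tube length (cm) implied by the first resonance alone: L = c / (4 * F1)."""
    return c / (4.0 * f1_hz)


predicted = tube_formants(7.75)
print([round(f) for f in predicted])

implied_length = tube_length_from_f1(1525.0)
print(round(implied_length, 1))
```

For a 7.75 cm tube this gives roughly 1100, 3300 and 5500 Hz, and a first formant of 1525 Hz corresponds to a tube of about 5.6 cm. The sketch says nothing about whether a uniform tube is an adequate model of the neonatal vocal tract, which is the point at issue in the text.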


vocal tract and 2) models relating formant frequencies to adult anatomy may not apply to neonates. Lieberman et al. (1971) used data about the neonatal tongue (Hopkin, 1967) and the positioning of the neonatal larynx (Noback, 1923) relative to adult measurements (Chiba and Kajiyama, 1958), and estimated the newborn's vocal tract size to be 7.5 cm. Colton and Steinschneider (1981) provide no reference for their estimate of 8 cm. Furthermore, as Golub (1980) points out, there are obvious anatomical differences between adults and neonates (e.g., infants have tongues that nearly fill the oral cavity, and large fat pads in their cheeks) which result in different acoustically relevant dimensions such as 1) the ratio of pharynx length to mouth length or 2) the ratio of nasal tract length to vocal tract length. Cineradiographic data from Bosma, Truby and Lind (1965) indicate that during the crying episode, the supralaryngeal vocal tract is nearly stationary, with the tongue thrust down and forward in the open mouth.

Traditionally, the acoustical resonances for the human vocal tract have been modeled as a simple tube (uniform cross sectional area) which is closed at one end (Fant, 1960; Stevens and House, 1961). Thus, the resonant frequencies of the tube are given by the equation

$$F_k = \frac{(2k+1)\,C}{4L}, \qquad k = 0, 1, 2, \ldots$$

where C = the velocity of sound (approximately 340 meters/second, or 34,000 cm/second), L = the length of the tube (in cm) and k = 0, 1, 2 for the first, second and third formants respectively. Thus, for a vocal tract size of 7.75 cm (the approximate values used by Lieberman et al. (1971) and by Colton et


al. (1981)), F1-F3 values of approximately 1100, 3300 and 5500 Hz would be predicted. Data approximating the cited values were reported by Lieberman et al. (1971). Specifically, the cries they examined appeared consistent with this relationship; hence they argued that the rigidity of the infant supralaryngeal tract apparently was responsible for the constancy of relationships among the formant values. However, as noted above, while the F2 values reported across all studies have been consistent, the data for F1 generally have been higher than that found by Lieberman et al. Based on the F1 value generated in most studies (approximately 1525 Hz), vocal tract length would be estimated to be 5.6 cm, or approximately one-third the vocal tract length of an adult male (Zemlin, 1981). Colton and Steinschneider obtained F2 (but not F1) values that agree with those of Lieberman et al. (approximately 3200 Hz); they appear to believe that these values do reflect the vocal tract resonances based on the uniform tube equation. In turn, they argue that a higher than expected degree of mouth opening is reflected in an elevated F1 value with minimal effect on the other formants (Fant, 1960).

Spectral distribution of energy level. Colton and Steinschneider (1980) used fast Fourier transform (FFT) techniques to specify the relative energy levels within three spectral bandwidths. At a microphone distance of 9 inches, they found 73 dB between 0.05-4.0 kHz, 64 dB between 4.0-8.0 kHz and 46 dB between 8.0 and 16.0 kHz. Given the relative ease with which these data can be generated, it is surprising that more research has not been carried out with this parameter. Spectral analyses of this type (but with higher


resolution) might be useful to characterize the appropriate energy distributions for normal, healthy infants under a variety of conditions. For example, comparisons of spectral curves from different cry-types might indicate different energy levels in particular frequency intervals, which in turn might suggest the presence or absence of noise within the signal.

Intensity analyses of cry. The study of intensity constitutes one of the most challenging problems in bioacoustics, since in many instances, recording procedures are not designed with this parameter in mind. The studies summarized in Table 1-3 represent the few descriptions of the intensity characteristics of the normal newborn cry.

A number of investigators have reported their impressions of the relative amplitudes of cry sounds. Ostwald (1963) claimed that normal cry sounds are generally 20 dB more intense than other vocal sounds. Stark and Nathanson (1975) noted that the cries of an infant who had died of Sudden Infant Death Syndrome (SIDS) were about 10 dB lower in intensity than the cries of normal infants. Truby and Lind (1965) noted that the cries of CNS damaged infants are characterized by a greater variability in their output levels. Each of these reports seems reasonable, since it might be expected that healthy infants cry with greater control, consistency and expiratory power than compromised infants.

Clearly, there has been a lack of quantitative data about the intensity of the normal cry. Most studies of cry have been carried out in clinical situations where environmental noise is difficult to control. Few studies have actually attempted to calibrate the recording systems other than by trying to maintain the microphone a


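Table 1-3 -- Studies of intensity of normal neonatal cry.

--------------------------------------------------------------------------------
Study                               N  Age     Class.  X (dB)  SD     Mic. Dist.
                                       (days)                         (inches)
--------------------------------------------------------------------------------
1. Ringel & Kluppel, 1964          10  1-2     4m/6f   82.13   3.40   12
2. Sheppard & Lane, 1968            1  1-9     m       0.32@   *      *
                                       13-21           0.29@
                                       25-33           0.31@
                                    1  1-9     f       0.30@
                                       13-21           0.25@
                                       25-33           0.34@
3. Zeskind & Lester, 1976          12  2       lcomp.  42.75   9.24   16
                                   12  2       hcomp.  46.38   8.90
4. Colton & Steinschneider, 1980   33  7       m       74.34   6.79   9
                                   33  7       f       73.72   7.36   9
--------------------------------------------------------------------------------
*Not provided
@Values represent "coefficient of variability"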
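
The Sheppard and Lane entries in Table 1-3 are coefficients of variability rather than levels in dB. As a hedged illustration only, the sketch below assumes the conventional definition (sample standard deviation divided by the mean) and averages it over utterances. The amplitude values and function names are invented for the example and are not data or procedures from that study.

```python
import statistics


def coefficient_of_variation(amplitudes):
    """Sample standard deviation divided by the mean (assumed definition)."""
    return statistics.stdev(amplitudes) / statistics.mean(amplitudes)


def mean_cv(utterances):
    """Average the within-utterance coefficient of variation over utterances."""
    return statistics.mean([coefficient_of_variation(u) for u in utterances])


# Hypothetical amplitude measures for two utterances, in arbitrary linear units.
# The second utterance is the first scaled by 2.5.
utterances = [
    [0.8, 1.0, 1.2, 1.0],
    [2.0, 2.5, 3.0, 2.5],
]

print(round(mean_cv(utterances), 3))
```

Because the measure is a ratio, it is unchanged when every amplitude in an utterance is scaled by the same factor, which is what makes it usable when the absolute recording level (for example, the microphone distance) is not known.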


certain distance from the crying infant. Ringel and Kluppel (1964) suspended a microphone one foot above the child's head, and Colton and Steinschneider (1980) affixed a microphone to the side of the baby's crib, but in neither study were the movements of the child or the general noise level controlled other than by being in a "quiet room." Sheppard and Lane (1968) controlled for noise by recording within "plexiglass air cribs," but made no mention of microphone distance. In addition, they attempted to derive a meaningful measure of relative cry intensity by calculating an "average coefficient of variation within utterances in the amplitude measures" (p. ).

Temporal analyses of cry. The cry event may be thought of as comprising several temporal markers--the stimulus application, the first expiratory burst, onset of first phonation, offset of first phonation, onset of second phonation--any of which may be used to designate the elapsed time for sub-events. Two of these sub-events, latency to the first phonation and duration of the first phonation, have been discussed repeatedly in the literature.

Latency. Latency is the elapsed time from the application of a cry stimulus (e.g., pain, startle) to the onset of the first expiratory phonation. It may be thought of as consisting of three events: 1) neurological processing time, 2) inhalation time and 3) the time required for expiratory adjustments appropriate to onset. Latency has traditionally included inhalation since it is often difficult to ascertain from the recorded audio signal the precise onset and duration of that event. Neurological processing time refers to the amount of time the organism needs to respond to the stimulus--it is assumed that the stressed or compromised individual


will respond less effectively (and consequently in a different time frame) than the healthy individual. Michelsson's (1971) data for 158 normal children over the span of their first year suggest that the latency remains stable during that time. However, she also makes the excellent point that the cry latency may vary slightly due to the alertness and state of the infant.

The summarized latency data from seven cry studies are presented in Table 1-4. As may be seen, most of the reported mean latency values are consistent with one another, with the possible exception of the three "high" values from Michelsson (1971) (1.8 seconds) and from the Lester and Zeskind (1976, 1978) underweight and high complication populations. Again, the data from Lester and Zeskind are revealing. Their low complication and full weight infants exhibit latency values which agree with values reported by many other investigators. However, normal "at risk" infants in those studies exhibit values which are greater than those cited by others, a relationship which probably reflects their stressed CNS status.

The relative agreement among temporal values obtained in different studies is probably due in part to the fact that latency measurements can be made from calibrated time lines, which tend to be quite accurate. Fisichelli and Karelitz (1963) used a graphic milliammeter operating at five seconds per inch to make their measurements. Other investigators have used spectrograms to make temporal measurements; timing is the most accurate dimension of the t-f-a display. One source of potential variability in temporal measurements comes from the fact that most studies have utilized a verbal signal (e.g., "Now") to identify the application of the cry


Table 1-4--Studies of latency of normal neonatal cry.

Study                            N    Age    Class    X (sec)  SD     Range (sec)
1. Fisichelli & Karelitz, 1963   17   1-30            1.54     0.66   0-2.0
2. Michelsson, 1971              50   1-10            1.80            1.2-2.5
3. Caldwell & Leeper, 1974       26   2      m        1.46     0.27
                                      3               1.41     0.30
                                      4               1.67     0.94
                                      2      f        1.45     0.40
                                      3               1.50     0.51
                                      4               1.55     0.54
4. Lester, 1976                  12   2      flwt.    1.47     0.26
                                 12   2      unwt.    1.80     0.69
5. Lester, 1978                  12   2      flwt.    1.56     0.46
                                 12   2      unwt.    1.72     0.29
6. Zeskind & Lester, 1978        24   2      lcomp.   1.37     0.62
                                 24   2      hcomp.   2.08     1.10
7. Thoden & Koivisto, 1980       38   1               1.40     0.80   0.4-3.6
                                      5               1.40     0.70   0.4-3.2
8. Zeskind & Lester, 1981        19   2      ave-PI   1.20     0.30
                                 19   2      low-PI   1.76     0.66
                                 19   2      hi-PI    1.66     0.90

*Not provided


stimulus. If latency is to serve as an indicator of sensorineural and motoric processing time, differences of milliseconds might be significant. Thus, the resolution of the latency metric may be degraded by the variability inherent in using a vocal correlate of the stimulus.

Duration. Duration, or cry-time, refers to the expirational phonation time, often only of the first cry, but sometimes until the end of a specified phonation in the nth cry cycle. Duration is a temporal measure, and thus is reasonably easy to determine accurately--at least prior to the lower amplitude portions of the decaying signal. However, as may be seen in the data presented in Table 1-5, there is variability in the obtained values: from 1.2 seconds for the first cry of 33 newborns (Colton and Steinschneider, 1980) to 6.45 seconds for the first cry of 24 low complication newborns (Zeskind and Lester, 1978). The larger of these sample values is nearly 540% of the smaller! Some of these discrepancies are undoubtedly due to the lack of agreement among researchers as to the boundaries of the event to be measured. For example, Thoden and Koivisto (1980) measured the first phonation between the first two inspirations. However, Michelsson (1980) indicates that the continuity of phonation is often broken, and notes that she would only accept for measurement cries of more than 0.4 seconds. Thus, the values from previous studies are often not only highly variable, but in disagreement. For example, Lester (1976) presents data indicating that low complication infants have shorter cries than high complication babies, while the same author (1978) also has provided data suggesting that full weight newborns cry longer than underweight ones. The data from the former


32 Table 1-5 -Studies of duration of normal neonatal cry. Study 1. Ringel & Kluppel,1964 2. Sheppard & Lane,1968 3. Wolff, 1969 N Age (days) Class X Dur. SD (sec.) 10 1-2 4m/6f 1.5 0.62 1 1-9 m 0.6 13-21 0.5 25-33 0.5 1 1-9 f 0.1 13-21 0.4 25-33 0.6 1 3 4.1 Ind. Dur. * * 4. Michelsson,1971 50 1-10 2.0 5. Caldwell & Leeper,1974 6. Prescott,1975 7. Lester,1976 8. Lester,1978 9. Lester & Zeskind,1978 10.Zeskind & Lester,1978 26 4 12 12 12 12 40 24 24 11.Colton & Stein33 schneider,1980 33 12.Thoden & 38 Koivisto,1980 38 *Not provided 1 m 2 3 1 f 2 3 1-10 2 flwt. 2 unwt. 2 flwt. 2 unwt. 2 2 l~omp. 2 hcomp. 7 7 1 5 m f * 0.61 0.60 0.68 0.67 0.68 0.62 1.2 0.55 1.5 0.39 2.7 1.15 1. 3 o. 54 4.8 0.38 2.5 2.23 6.5 3.62 3. 8 1. 79 1.2 0.39 1.2 0.31 5.0 2.80 5.2 2.30 * Ind. SD * * 0.14 0.20 0.22 0.34 0.19 0.24 * * * Range 0.6-4.0 k 1.6-3.6 * * * *


study are particularly difficult to reconcile in light of all other information presented by those authors.

Michelsson (1980) warns that since the intensity of the phonational signal trails off slowly, it may be necessary to boost the output gain in order to identify the end of the cry. Moreover, if input level controls are calibrated to avoid clipping the most intense portion of the cry signal, the least intense portions of the signal (at the end of phonation, for example) may be lost. Even if the low amplitude parts of the signal are captured on the recording, the analysis procedure may not be adequate to detect the quiet end of the phonational signal. In any case, some of the values reported for the duration of the first expirational phonation are too extreme to be accepted without question. In all likelihood, differences in definitions of the event, measurement procedures, and instrumentational difficulties account for the discrepancies.

Summary of Previous Studies

In general, the cited studies have reported a broad range of results for specific parameters. The sources of variation may be attributed to the following issues.

1) Researchers have often disagreed as to a) which acoustic or temporal cry features are important to study, b) what the definitions of a specific parameter should be and c) the best measurement procedure to apply to the task. Consequently, although a good deal of normative data has been generated for certain parameters, integration of data from different studies frequently has been difficult. Thus, an understanding of the cry


event must be predicated on appropriate and accepted physiological and acoustic constructs.

2) Until recently, most investigations of cry have tended to define normality only in terms of its difference from abnormality, and relatively few investigators have sought to establish acoustic/temporal baseline data for normal infants. Further, recent indications are that most studies which have focused on normal babies have not applied stringent enough criteria for normality to subject selection. Consequently, the data currently available for most neonatal cry parameters should be reconsidered in light of such realizations. Moreover, while some differences have been identified between the cries of normal and abnormal newborns, differences are also clear between the cries of different groups of normal infants. The introduction of developmental and somatic criteria into this type of research (e.g., Zeskind and Lester, 1978) provided additional dimensions which could be used to establish bounds to which specific data could be applied. Therefore, it was appropriate to attempt to establish a range of acoustical and temporal cry values for infants who have been clearly defined as normal and low risk.

Specific Aims

The cry of the newborn infant is a complex signal which, through its components, reflects a host of underlying physiological processes. If the cry is to be meaningfully studied in all its diverse manifestations, it first is necessary to describe it operationally. Clearly, such a process is fundamental to the understanding of the cry


as a communicative act--as well as to the development of cry-based diagnostic systems which seek to discriminate those cries which deviate from the normal, or evaluate cry characteristics for evidence of underlying physiological processes. While a composite review of the cry literature yields data for a broad range of parameters, few studies have been carried out which have applied even a high proportion of these analyses to a single data base. Finally, several types of analysis are now available which previously could not be utilized in cry research; they have the potential to provide excellent information about the cry signal. Consequently, an attempt was made to obtain acoustic and temporal values for neonatal cries in order to enhance the present understanding of the nature and use of the cry signal. There were three major goals:

1) Incorporation and application of the range of cry parameters used in previous studies to a single subject sample. It was expected that this research would provide normal data for an extensive range of acoustic/temporal parameters.

2) Clarification of existing relationships based on cry data obtained from previous studies. The specific areas included a) fundamental frequency (including range and mode), b) spectral (formant) peaks, c) peak intensity level, d) latency and e) duration.

3) Broadening of the acoustic/temporal cry data base by the introduction of appropriate measures previously unused in cry research. They included a) jitter and b) spectral distribution (i.e., long-term power spectra).


Within the context of these goals, a number of specific hypotheses were tested. Some of these hypotheses are based on previously obtained data, and some are based on subjective observations. They are as follows:

1) Fundamental frequency (F0) values for cries of medically and developmentally normal infants would be within the lower range (approximately 350-500 Hz) of values previously reported (See Table 1-1).

2) Since infants may be expected to exhibit less neuromuscular control than do older individuals, jitter values for neonates would be higher than those reported in the literature for adults. Since no data on jitter presently exist in the infant literature, this information would be new in this area.

3) The reported values for normal crying formant frequencies (See Table 1-2) are correct. That is, major spectral peaks (for cry) would be located at approximately 1400-1600 Hz for F1, 3100-3300 Hz for F2, and 5300-5400 Hz for F3, which relate in turn to vocal tract size in the neonate.

4) Maximum absolute intensity level values (re: SPL) during crying would be higher than those previously reported (See Table 1-3). That is, maximum intensity levels at a distance of 24" from the microphone were expected to be approximately 70-85 dB SPL.

5) Obtained latency values would be consistent with those values reported previously in the literature (See Table 1-4). Specifically, it was expected that the latency between stimulus application and the onset of the cry would be approximately 1.4-1.6 seconds. This value should reflect the time necessary


for the healthy infant to process the stimulus and begin the response.

6) Based on the values reported from previous studies (See Table 1-5), it was expected that the cry durations (for each expiratory phonation) would be highly variable (mean 0.5-5.0 seconds with a standard deviation of 2.5-3.5 seconds). Furthermore, it was hypothesized that the duration of the first cry would be longer than subsequent cries--and that first cry duration would be positively correlated with cry latency. Finally, it was hypothesized (from subjective observations) that cry responses could be sorted into two categories: one in which the cry began with a series of short "cough-like" bursts, and a second in which the cry was initiated with phonation.

7) Based on the hypotheses of Wolff (1969), Truby and Lind (1965), and Lester and Zeskind (1982), it was speculated that the first phonations would be different from the second phonations. Specifically, it was expected that fundamental frequency values, jitter values, formant frequencies, phonation durations, and peak intensity levels would all be different for the first than for the second cry. In the perceptual judgements, it was expected that the panels would rate the first cries as generally more strained and harsh than the second. Furthermore, the first cry was expected to have characteristics associated with the pain stimulus, while the second cry was expected to be more moderate, progressing toward a "basic" cry.


CHAPTER II
METHOD

Overview

The methods that were used to obtain and analyze the pain cries of a sample of normal, healthy newborn infants will be described below. Specifically, a mild pain stimulus (pressure from a cuticle stick attached to an autolet) associated with the administration of the Brazelton Neonatal Behavioral Assessment Scale (BNBAS) was used to elicit cries from 31 normal healthy infants between the ages of 26 and 30 days--and these cries were recorded for later analysis. That is, six frequency, intensity and durational analyses were carried out in an attempt to define the ranges and characteristics of normal neonatal cry. These data provided an "acoustic profile" for each infant and for the sample. In addition, each cry was judged for "strain" and "harshness" by a panel of trained listeners. Three tests were employed to ensure that each infant was physically and behaviorally within normal limits: a medical history (interview with parent), an anthropometric measure and a standardized behavioral assessment test.

Subject Selection

This project represented an attempt to establish stable data for the pain cries of a homogeneous sample of normal, healthy infants at the end of the first month of life. Consequently, rather specific subject selection criteria were needed. For inclusion in the sample, each infant was required to be full term (33-42 weeks gestational age)


and 26-30 days postnatal age. In addition, three criteria were used to ensure that each neonate was medically, physically and developmentally within normal or "low-risk" limits. First, the parent was interviewed to determine whether the infant's history included any pre-, peri- or post-natal medical complications. Second, the Ponderal Index (PI), a ratio of body weight to length, was calculated to determine if the child had appropriate fetal and postnatal growth. Finally, developmental and behavioral status was determined by administration of the Brazelton Neonatal Behavioral Assessment Scale (Brazelton, 1976). This test will be referred to subsequently as "The BNBAS" or "The Brazelton." Each of these procedures is discussed below in greater detail.

Officials of the Alachua County (Florida) Board of Health provided the experimenters with photostatic copies of all certificates of birth filed and recorded in the Alachua County area. The experimenter then attempted to contact the parents of each newborn by telephone. If the parent could be reached, the details of the procedures were explained and the child's participation requested. Additional subjects were obtained by referral from Gainesville, Florida, pediatricians. In this case, pediatricians were contacted first by letter (See Appendix C) and then by phone and briefed about the general nature of this project. Those physicians who agreed to cooperate referred appropriate patients to the investigators for testing and recording. Parents of prospective subjects were provided with a letter detailing the interests and procedures of this project (See Appendix D).


Appendices I and J contain the age, somatic and Brazelton data for each infant. In all, 31 newborns (22 males and 9 females) were included in the sample. Each infant was of full term gestational age at delivery, and was between 26 and 30 days (X=28.4, SD=1.0) postnatal age. Birthweights for the babies ranged from 2807 to 4877 grams (X=3645, SD=466), and birth lengths were from 48.3 to 55.9 cm (X=52.8, SD=2.1). All infants had normal and uneventful medical histories, as reported by their parents. Finally, all infants exhibited appropriate weight for length, indicating proper pre- and postnatal nutrition (See Appendix I), and were developmentally within normal limits according to their scores on the BNBAS (See Appendix J).

Determinations of Normality

The parent interview

Procedure

When parents were contacted concerning the Brazelton test, they were told that the research would concern "normal, healthy" babies. Hence, they were asked whether complications had been noted in the child's pre-, peri- or post-natal history. If the parent agreed to have the baby participate in the study, she/he was asked to bring birth information to the experimental session. At that time, the child's birth history (weight, length, head circumference, Apgar scores, length of labor, birth order, anesthesia given mother during labor, and abnormalities of labor) and post-natal history (current statistics, physician's name, findings of post-hospital medical examinations) were reviewed. Indications of medical problems resulted in rejection from the subject pool (although in all cases the


Brazelton was still administered). All parents were asked to sign an informed consent form (See Appendix G) permitting the investigators to contact the physician for additional medical information.

Appropriate fetal growth

While a number of earlier cry studies have assessed normality on the basis of birthweight alone (e.g., normal infants above 2500 grams), Rohrer's Ponderal Index (PI) provides a means for determining the appropriateness of a newborn's body weight for his length (Lubchenco et al., 1966; Miller and Hassanein, 1971) (See Appendix E for a more detailed discussion of the PI). The PI may be calculated as follows:

$\mathrm{PI} = \dfrac{\text{weight (g)} \times 100}{[\text{crown-heel length (cm)}]^{3}}$

Statistics provided during the interview were used to calculate PI scores for all potential subjects. Since children falling outside the 3rd-97th percentile range have been shown to be at higher risk for morbidity (Miller and Hassanein, 1971), only those infants with birthweights and PIs within that percentile range were accepted as subjects. Thus, acceptable ranges for boys were 5.8-10.1 lbs. and 18.2-21.5 inches, while ranges for girls were 5.8-9.4 lbs. and 18.5-21.1 inches (Stuart and Meredith, 1980). Accordingly, PI ratios of 2.21-2.81 were considered indicative of normal fetal growth (Miller and Hassanein, 1971), and only infants falling within that range were included for participation in the study. It might be noted that the same range of values was used also by Zeskind and Lester (1981) to define a "fullweight" neonatal sample for cry analysis.
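The Ponderal Index screening described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names and constants are hypothetical, the formula is the one given in the text (weight in grams times 100, divided by the cube of crown-heel length in centimeters), and the 2.21-2.81 bounds are the acceptance range quoted from Miller and Hassanein (1971). The example input uses the sample mean birthweight and birth length reported for this study, not the data of any individual infant.

```python
# Illustrative sketch only; names are hypothetical, not from the dissertation.

# Acceptance range for the Ponderal Index quoted in the text.
PI_MIN = 2.21
PI_MAX = 2.81


def ponderal_index(weight_g: float, length_cm: float) -> float:
    """Rohrer's Ponderal Index: weight (g) x 100 / crown-heel length (cm) cubed."""
    if weight_g <= 0 or length_cm <= 0:
        raise ValueError("weight and length must be positive")
    return weight_g * 100.0 / length_cm ** 3


def within_normal_growth(weight_g: float, length_cm: float) -> bool:
    """True if the PI falls inside the range treated as normal fetal growth."""
    return PI_MIN <= ponderal_index(weight_g, length_cm) <= PI_MAX


if __name__ == "__main__":
    # Sample means reported for the 31 infants: 3645 g and 52.8 cm.
    pi = ponderal_index(3645, 52.8)
    print(round(pi, 2), within_normal_growth(3645, 52.8))
```

Applied to the sample means, the index comes out near 2.48, which lies inside the accepted range, as would be expected for a sample selected on this criterion.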


Developmental status

All potential subjects were tested for normality by means of the BNBAS. (See Appendix F for a more detailed description.) Since the administration of the Brazelton includes a pain cry elicited from a gentle poke, those cries were recorded for later analysis. Thus, the Brazelton was used in this study 1) to determine that the infant was behaviorally within normal limits and 2) to elicit a standardized cry which could be recorded for later analysis.

The pain stimulus of the Brazelton may be administered only if the infant is in a resting or sleep state. Consequently, infants who were alert during the entire test session could not be included in the subject sample. In addition, infants who did not respond to the pain stimulus with a full cry (state 6 on the consciousness scale utilized in the Brazelton) could not be included in the sample. A total of 48 infants were administered the BNBAS. Besides the 31 who were included in the subject sample, another 17 infants were disqualified on the following bases: 1) two infants were not judged to be developmentally within normal limits, 2) five were too alert throughout the test to permit the administration of the pain stimulus, 3) five gave no vocal response at all to the stimulus, and 4) the final five infants produced responses that were too "weak" to be considered cries.

The BNBAS is designed to be administered during the newborn period (i.e., up to 30 days). However, it should be delayed until the third day of life to allow the infant to stabilize; i.e., to recover from the stresses of delivery and/or the effects of medications used in the birthing process (Als et al., 1977). This factor presented a


problem for data collection, since local hospitals and birthing centers discharge all apparently normal newborns prior to the third day. The hesitancy of new parents to participate in scientific studies external to the hospital environment is understandable. Consequently, an age range from 26 to 30 days was selected as the time frame for this research.

Only trained and certified evaluators may administer the Brazelton. Since the author does not have this training, the cooperation of a qualified individual (Dr. Susan Armstrong--see Appendix F) was enlisted to perform the test and to interpret the results for the parents.

Collection of Cry Samples

Elicitation of the pain cry

The administration of the Brazelton involves the elicitation of several different responses (e.g., pain, startle, defensive), any of which may include a cry from the neonate. Although the entire test was recorded, the pain cry was the only response of interest, and the only one subjected to analysis. Each cry episode was initiated by an audible click (from the autolet) which signified the onset of the cry stimulus. The sample of interest was completed at the end of the second cry cycle (i.e., the second expiratory phonation).

Recording procedures

The subjects were recorded in a sound-treated booth (e.g., Industrial Acoustics Company, No. 1204.A). They were examined on a padded table, with a mark indicated on the sheet for head positioning


during the application of the pain stimulus. A Quest 1.125-inch PZT ceramic omnidirectional microphone was stabilized on a boom 24" above the surface of the bed. The control of head position and microphone distance permitted the measurement of absolute intensity level.

All acoustic events associated with the Brazelton test were recorded at 7.5 inches/second (ips) on a laboratory quality tape recorder (Teac A-60) using 1.5 mil magnetic tape. A high quality cassette recorder (JVC DD-9) was employed as a backup. A calibration tone (1000 Hz at 110 dB) from a Quest C-12 pistonphone calibrator was recorded (at 0 dB VU) on the experimental tapes before and after each experimental session. Tape recorder calibration procedures are outlined in Appendix H. Master tapes from the taping sessions were stored in a cool, dry area. High quality copies were made of the sections containing the cries of interest; all analyses were carried out on the copies.

Analysis Procedures

Acoustic and temporal analyses

Each pain cry sample was analyzed using six procedures. The following natural cry features were examined: 1) crying fundamental frequency, 2) jitter (JIT), 3) formant frequencies (F1-F5), 4) long term spectral distribution (LTS), 5) peak intensity level and 6) timing (i.e., latency and duration). Descriptive statistics were generated for each parameter within each analysis vector.

Crying fundamental frequency. Fundamental frequency values were determined as follows. The primary analysis was carried out on the IASCP Fundamental Frequency Indicator (FFI). Corroboration of the


mean F0 values was obtained from analysis of the harmonic structure of narrow band time-frequency-amplitude (t-f-a) spectrograms of the cries and from the lowest modal peaks of Fast Fourier Transform (FFT) power spectra generated for the cries. Each of these techniques is outlined below.

The Fundamental Frequency Indicator (FFI). FFI consists of a series of lowpass filters, with cutoffs at half-octave intervals, coupled to high-speed switching circuits which are controlled by a logic system (Hollien, 1981). FFI measures each wave by producing a string of pulses--each pulse marking a boundary of the fundamental period from the complex cry waves--which are delivered to the computer. The number of clicks generated between pulses by an electronic clock (100K/sec) is resolved into the time period for each fundamental cycle. The speech signal is represented as a series of cycle duration interval times: c(1), c(2), ..., c(n), where n is the number of cycles detected in the sample being studied. A standard histogram is constructed to summarize the frequency of occurrence of each length of cycle value. The cycle/period information c(i) is transformed to frequency f(i) in Hz, where f(i)=1/c(i). The histogram of frequency/period information was used to compute cry parameters such as the mean value (in Hz and semitones (ST)), the standard deviation (in ST), the modal frequency (in Hz and ST), the highest and lowest frequencies (in Hz and ST) detected in the sample, and the range (in ST). Finally, since the upper limit of FFI (800 Hz) is inadequate to deal with the range of F0 values that may be produced by a crying infant, it was necessary to play


each recording at half speed through the system and then double the output.

Spectrographic Analysis (t-f-a). Narrow band spectrograms have been the most widely used method of estimating crying fundamental frequency in the past. Although the spectrogram provides adequate temporal representation of the signal, resolution of frequency is poor by contemporary standards, and there is no way to determine accurately either the mean or the mode of F0. As mentioned above, calculation of F0 is accomplished by choosing a point (in time) for measurement, counting the nth harmonic overtone, using a template to estimate the frequency of that overtone, and dividing by n. In this study, a Voice Identification Model 700 spectrograph was employed to estimate the mean F0 of the cries in question.

Fast Fourier Transform Analysis (FFT). This analysis procedure provides information on the relative amount of energy within 40 contiguous one-sixth octave bandwidths. A power spectral curve is developed which displays the distribution of energy (over time) for the signal in question. Modal F0 may be identified as the lowest peak in the curve. A Princeton 4512 Fast Fourier Transform Spectrum Analyzer was utilized for this procedure.

Vocal jitter (JIT). Since the period of each laryngeal cycle was obtained during extraction of fundamental frequency information, that same information was employed to calculate the temporal variability of adjacent periods in terms of mean jitter (in percent). The calculation of a sample's jitter may be expressed as


$$\mathrm{JIT} = \frac{1}{n} \sum \frac{|P_1 - P_2|}{(P_1 + P_2)/2}$$

where P1 and P2 are the periods of adjacent cycles within a phonational sample, and n is the total number of adjacent pairs examined. Thus, jitter is the mean absolute difference between the periods of each pair of adjacent cycles, divided by the mean period of those two cycles and expressed in percent. However, measurement of this potentially important cry parameter is difficult for two reasons. First, since jitter is calculated from the changes between adjacent periods, continuous increases or decreases in frequency may bias the results. Consequently, most studies of jitter have used steady-state phonation for analysis. Obviously, it is not possible to control the vocal output of newborns, nor is it always possible to extract a steady-state portion of the cry for analysis. Second, aperiodicity within the cry signal may result in frequencies that exceed the rules for frequency continuity (i.e., the cast limit) set up within FFI and/or restrict the data utilized in the calculation of jitter. Since FFI will only accept contiguous points from a noisy phonation, the data utilized in jitter calculations can only be as variable as the cast limit. Thus, use of the cast subroutine may result in a diminished representation of the true variability of crying phonation. Conversely, if the cast subroutine is not employed, the noise component introduced into the calculations would result in a jitter value perhaps more accurate relative to the true variability of the signal, but less related to actual phonation than would be desirable. In this study, the decision was made to use the normally utilized cast


limit of six semitones, which would still provide an indication of the variability within each cry, and would allow for comparison to data generated for adults. In addition, the samples with the highest and lowest jitter values were checked using another IASCP software approach known as the harmonics/noise (H/N) ratio. In this procedure, the signal is digitized and a cursor is used to mark the boundaries of successive periods. The length of each marked period is then automatically calculated and the resultant jitter value computed.

In spite of the difficulties outlined above, a relative jitter value can provide direct quantification of the variability within crying phonation. Crying may be considered to represent a particular but extreme example of phonation, with variability that is dependent upon the nature of the stimulus, the infant's internal state and laryngeal dynamics. Because there is no cry jitter information preceding the data generated by this study, it should be considered exploratory in nature.

Formant frequencies (F1-F5). As noted in Chapter I, a number of investigators have reported values for the first three frequency formants of neonatal cries. These regions of concentrated energy reflect the resonances imposed on the acoustic signal by the anatomical characteristics of the vocal tract. While most workers have used t-f-a spectrograms as a basis for analysis, some (e.g., Gardosik et al., 1980) have employed power spectra. The analysis of formant values undertaken in this study utilized both techniques. Determinations of up to five formant frequencies for each phonational sample were made from power spectra generated on the Princeton 4512


FFT spectrum analyzer, and corroboration of those values was made from spectrographic analysis (Voice Identification Model 700 spectrograph). In addition, mean formant frequencies (for all cries, for first and for second phonations) were obtained from the average power spectral curve generated for the long term spectral analysis.

Long term spectral analysis (LTS). Analysis of power spectra has been used relatively successfully in the development of speaker identification techniques (e.g., Doherty, 1975) and the evaluation of voice and speech disorders (e.g., Frokjaer-Jensen and Prytz, 1974; Wendler et al., 1980), as well as in cry research (Colton et al., in press). In contrast to the specification of formant frequencies, LTS provides information about the relative distribution of energy throughout the frequency spectrum during vocalization. In infants, it should provide information about energy emphasis within general bandwidths (i.e., broad changes in resonance) and the relative amounts of noise within a phonation (peak to spectral "floor" distance). However, since very little spectral data exist as yet, the information generated here should be considered tentative. Hopefully these data will be compared eventually to data developed for other normal infants as well as compromised samples. The system utilized at IASCP includes a Princeton 4512 FFT spectrum analyzer coupled to the PDP-11/23 computer. This vector uses 1/6 octave bands to generate a 33 parameter power spectrum curve covering the frequency range from 140 to 7,220 Hz.

Intensity. Four previous studies have reported peak intensity levels in the crying infant (Ringel and Kluppel, 1964; Sheppard and Lane, 1968; Zeskind and Lester, 1976; Colton and Steinschneider,


1980). As noted in Chapter I, the reported values have been wi

Figure 2-1. Schematic of the neonatal pain cry. Uppercase letters denote major events within the cry. Lowercase letters define the segments within the cry.

ST--Stimulus Application
V1--First Non-phonatory Vocalizations
P1--First Phonation
V2--Second Non-phonatory Vocalizations
P2--Second Phonation

Legend
ab--ST to V1 onset
c--# of V1s
bd--V1 onset to P1 onset
ad--ST to P1 onset
ab/ad--ST to V1 onset or P1 onset
de--P1 duration
f--# of V2s
eg--P1 offset to P2 onset
gh--P2 duration
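The segments named in the legend of Figure 2-1 reduce to differences between event times. The following is a minimal sketch, assuming the onset and offset times have already been read (in ms) from the graphic level recorder trace; the `CryEvents` container, its field names and the numbers in the usage note are hypothetical illustrations and are not part of the study's procedure. The two count parameters (c and f) are simple tallies and are omitted.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class CryEvents:
    """Event times in ms from a common time origin (hypothetical container)."""
    st: float                  # a: stimulus application
    v1_onset: Optional[float]  # b: first non-phonatory vocalization, None if absent
    p1_onset: float            # d: onset of first phonation
    p1_offset: float           # e: offset of first phonation
    p2_onset: float            # g: onset of second phonation
    p2_offset: float           # h: offset of second phonation


def segment_durations(ev: CryEvents) -> Dict[str, Optional[float]]:
    """Return the Figure 2-1 segments in ms; V1-based segments are None without a V1."""
    has_v1 = ev.v1_onset is not None
    # Latency to the first vocal response of either kind (segment ab/ad).
    first_response = ev.v1_onset if has_v1 else ev.p1_onset
    return {
        "ab": ev.v1_onset - ev.st if has_v1 else None,
        "bd": ev.p1_onset - ev.v1_onset if has_v1 else None,
        "ad": ev.p1_onset - ev.st,
        "ab/ad": first_response - ev.st,
        "de": ev.p1_offset - ev.p1_onset,
        "eg": ev.p2_onset - ev.p1_offset,
        "gh": ev.p2_offset - ev.p2_onset,
    }
```

For example, a hypothetical cry with the stimulus at 0 ms, V1 at 400 ms, P1 from 900 to 3900 ms and P2 from 5100 to 7600 ms yields ab = 400, bd = 500, ad = 900, de = 3000, eg = 1200 and gh = 2500 ms.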


(stimulus processing and response time) if the infant did not respond first with a pre-phonational vocalization.

4) V1-P1--The elapsed time (in ms) from the first pre-phonatory vocalization to the onset of the first phonation. This measure represented a transition from the interruptive response of the pre-phonation to the beginning of a release of air.

5) St-V1/P1--The elapsed time (in ms) from the stimulus application to either the first pre-phonatory vocalization or the first phonation. This parameter combines the first (St-V1) and third (St-P1) parameters to derive an accurate measure of the actual latency from stimulus application to the first vocal response.

6) P1 On-Off--Duration (in ms) of the first phonation, from onset to offset. This measure describes the duration of the first vocal expiratory release after the stimulus.

7) P1-P2--The elapsed time (in ms) from the offset of the first phonation to the onset of the second phonation. This parameter relates to the time required for respiratory recovery, inhalation, and readjustment for the next phonation.

8) #Vs P1-P2--A simple count of the number of non-phonatory bursts between the first and second phonation. This measure indicates the proclivity for interruption within the second cry cycle.

9) P2 On-Off--Duration (in ms) of the second phonation, from onset to offset. This is a measure of the time expended in energy output in the second phonation.

In order to measure each segment of the cry episode, the graphic level recorder (described in the previous section on intensity) was utilized. The temporal segments in the cry episode could be identified


and measured by noting the boundaries indicated by sudden changes in intensity.

Perceptual analyses

This project was developed as an attempt to establish the acoustic and temporal correlates of the cry signal. However, certain details of the cry signal tend to be missed by the types of electroacoustical analyses used here. For example, although it was clear from listening that some phonations were noisy, this parameter was difficult to quantify. Moreover, most of the analysis procedures used in this project allow for the examination of a signal independent of time (i.e., rather than moment by moment). Consequently, transient events within the cry signal tended to be overlooked. While listening to the cries, it became clear that a thorough description of the signals required the detailing of particular types of events within each signal. In addition, it was the author's judgement that first cries were rougher and more strained than second cries, although that judgement could not be verified without testing other individuals.

As a result, two different types of perceptual analyses were carried out. In the first, the author simply rated each cry for the presence of four characteristics:

1) roughness--briefly intermittent to continuous and "saturated" levels of embedded noise (aperiodic phonation or "dysphonation").

2) register shifts--"voice breaks" or sudden transitions in vocal output from one register to another (generally a higher one) and back again.


3) frequency variability--extreme fluctuations in pitch. These broad variations in frequency could be glides that were increasing or decreasing in frequency, or wavers in which the frequency seemed to "wander."

4) vocal fry--this phenomenon tended to occur at the end of a sustained phonation, when available air to power the cry was greatly reduced.

The second analysis procedure focused upon the perceived level of roughness and strain in each cry. Roughness was broadly defined for the task as noise, aperiodicity or harshness in the signal during phonation. It related to the first parameter of the author's judgements. On the other hand, strain related to the perception of inordinate exertion during phonation, and probably would include the perception of fry (#4 above). For this procedure, a panel of five trained listeners--speech pathologists and/or phoneticians who were all specialists in voice--were asked to rate each cry on a five point scale (1=least, 5=most) for levels of strain and roughness.

Statistical Analysis Procedures

The procedures outlined above resulted in a data-base consisting of measurements from six acoustic/temporal analysis procedures (involving 109 parameters), as well as perceptual judgements by a panel of trained listeners and by the author. This study was intended primarily to be descriptive in nature; consequently, means and standard deviations for all phonations were determined for all parameters.


Moreover, one aim of this study was the comparison of first and second phonation (P1 and P2) cry characteristics for each cry. Thus, means and standard deviations for P1 and P2 were calculated for all frequency, formant, long term spectral and intensity parameters. In addition, the mean and standard deviation of all listeners' perceptual judgements (relative to roughness and strain) were calculated for each phonation. Paired comparison t-tests were applied to the raw data for P1 and P2 from each of these acoustic and perceptual parameters. Finally, in an effort to determine the strength of relationship between each parameter and all other parameters, a Pearson correlation matrix was calculated.
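The two statistics named above can be sketched in a few lines. This is a generic textbook formulation under the usual assumptions (one P1 and one P2 value per infant, paired by position), not a reconstruction of the software actually used in the study; the function names are illustrative.

```python
import math
from statistics import mean, stdev


def paired_t(first, second):
    """Paired-difference t statistic and degrees of freedom for matched samples."""
    diffs = [a - b for a, b in zip(first, second)]
    n = len(diffs)
    # Standard error of the mean difference (sample SD with n - 1 in the denominator).
    standard_error = stdev(diffs) / math.sqrt(n)
    return mean(diffs) / standard_error, n - 1


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)
```

The same relationship can be checked against summary values: with the mean difference of 0.5 ST and standard deviation of 3.6 ST reported for 30 pairs in the first row of Table 3-4, t = 0.5 / (3.6 / √30), or approximately 0.76.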


CHAPTER III
RESULTS

The pain cries of 31 infants were studied. Subjects, who were between the ages of 26 and 30 days, were administered the BNBAS; the cries were elicited by the pain stimulus (or "poke") administered as part of that examination. All infants whose cries were analyzed were found to be within normal limits both in terms of body size (on the Ponderal Index--Appendix I) and developmentally (on the Brazelton--Appendix J).

As was discussed in Chapter II, the recorded cries were subjected to six acoustic and temporal analyses: 1) fundamental frequency, 2) jitter, 3) formant frequencies, 4) long-term spectral composition, 5) intensity and 6) timing. Moreover, a panel of trained listeners perceptually judged each cry in an attempt to characterize levels of "roughness" and "strain." The results obtained from each of these procedures are presented below.

Acoustical Analyses

Fundamental Frequency (F0)

As stated, a recording was made of the first two phonational portions of the cry response to a pain stimulus. Each separate phonation was subsequently rerecorded to form a (continuous) cry sample lasting approximately twenty seconds. Analyses then were carried out for the following fundamental frequency parameters using the IASCP Fundamental Frequency Indicator (FFI): 1) mean (X) frequency


in Hz and semitones (ST), 2) standard deviation (SD) of that frequency in ST, 3) modal (Mode) frequency (Hz and ST), 4) lowest (Lo) detected frequency (Hz and ST), 5) highest (Hi) detected frequency (Hz and ST) and 6) range (ST). In addition, the obtained mean F0 values were confirmed by examination of modal peaks occurring on long-term spectral plots and by consideration of F0 from narrow-band time-frequency-amplitude (t-f-a) spectrograms. Analysis of a related vector, jitter (JIT), was carried out using the frequency data generated by FFI; these values are presented in the fundamental frequency tables as well but will be discussed in a later section.

The results of the individual F0 analyses (as well as group means and standard deviations for each parameter) for the first two phonational portions of each cry* are presented in Table 3-1. The verification of the fundamental frequency data is presented in Table 3-2. As may be seen from consideration of Table 3-1, a wide range of individual mean F0 values was found for the cries; these values ranged from 312 to 1299 Hz (X=505 Hz, SD=3.0 ST), a difference of more than two octaves. However, upon closer examination (see the scatterplot presented in Figure 3-1), the data clustered into two frequency categories. Of 61 cries analyzed, 57 (93.4%) registered a mean F0 below 600 Hz, while the remaining four (6.6%) were above 900 Hz. When the means of each of these clusters were calculated (see Table 3-3), the average of the low frequency cries dropped to 465 Hz (SD=3.0 ST), and the "hyperphonational" cries were more than one octave higher at

* There are two samples for all but one (#11) of the 31 infants. In that case, the baby achieved one full cry in response to the stimulus but then quieted.


Table 3-1. Fundamental frequency values for first phonation (P1) and second phonation (P2) cries. Values were generated using the IASCP Fundamental Frequency Indicator (FFI). Frequency values are presented in Hertz (Hz) and/or semitones (ST).

--------------------------------------------------------------------
Cry       X    X    SD  Mode Mode    Lo   Lo    Hi   Hi Range      %
#        Hz   ST    ST    Hz   ST    Hz   ST    Hz   ST    ST    JIT
--------------------------------------------------------------------
1.1     557   61   4.2   494   59   291   50  1068   73    23   15.0
1.2     495   59   2.5   494   59   288   50   605   63    13    8.0
2.1     488   59   4.3   554   61   330   52   798   67    15    9.6
2.2     435   57   1.7   440   57   349   53   530   60     7    8.1
3.1     479   58   5.0   440   57   255   48   988   71    23   10.9
3.2     471   58   4.0   466   58   275   49   719   66    17   21.4
4.1     541   61   8.4   523   60   234   46  1319   76    30   11.5
4.2     312   51  11.5   370   54    87   29  1397   77    48   25.3
5.1     381   55   2.8   370   54   257   48   544   61    13    9.4
5.2     445   57   5.1   392   55   246   47  1110   73    26    5.3
6.1     404   56   6.0   392   55   233   46   494   59    13    8.5
6.2     465   58   1.6   440   57   370   54   648   64    10    3.2
7.1     420   56   2.2   415   56   277   49   623   63    14    7.9
7.2     457   58   1.3   466   58   370   54   509   60     6    6.8
8.1     487   59   1.6   466   58   349   53   641   64    11    9.0
8.2     520   60   1.0   523   60   416   56   623   63     7    4.7
9.1     423   56   1.4   440   57   339   52   466   58     6   10.1
9.2     455   58   2.0   440   57   294   50   587   62    12    5.4
10.1    461   58   6.1   440   57   220   45   880   69    24   14.1
10.2    548   61   2.0   554   61   415   56   659   64     8    9.9
11.1    399   55   3.3   415   56   262   48  1175   74    26   10.4
12.1    445   57  10.2   415   56   139   37  1319   76    39   12.6
12.2    537   60   2.6   494   59   441   57   641   64     7    9.9
13.1    497   59   1.9   494   59   393   55   587   62     7   10.6
13.2    513   60   1.6   523   60   415   56   659   64     8    7.2
14.1   1299   76   1.5  1319   76   848   68  1480   78    10    3.4
14.2    586   62   1.1   554   61   467   58  1397   77    19    4.8
15.1    572   62   2.2   554   61   432   57   726   66     9   11.8
15.2    594   62   1.2   587   62   523   60   698   65     5    5.8
16.1    956   70   1.8   880   69   762   67  1319   76     9    4.7
16.2    595   62   7.1   831   68   277   49   988   71    22   13.6
17.1    391   55   2.2   392   55   262   48   554   61    13    9.7
17.2    365   54   2.7   349   53   262   48   509   60    12    6.5
18.1    359   53   8.4   466   58   139   37  1047   72    35   13.1
18.2    425   56   2.2   392   55   311   51   554   61    10    9.8
19.1    389   55   5.9   370   54   294   50   784   67    17   14.3
19.2    443   57   3.0   466   58   262   48   610   63    15    7.5
20.1    555   61   1.9   523   60   360   54   784   67    13    7.6
20.2    574   62   3.2   554   61   370   54  1319   76    22    6.1
21.1    484   59   1.3   466   58   392   55   623   63     8    6.6
21.2    527   60   1.2   523   60   415   56   641   64     8    3.8
22.1    389   55   2.3   392   55   196   43  1397   77    34   16.0
22.2    459   58   2.0   466   58   349   53   554   61     8   13.3
--------------------------------------------------------------------


Table 3-1--continued.

--------------------------------------------------------------------
Cry       X    X    SD  Mode Mode    Lo   Lo    Hi   Hi Range      %
#        Hz   ST    ST    Hz   ST    Hz   ST    Hz   ST    ST    JIT
--------------------------------------------------------------------
23.1    414   56   1.7   415   56   302   50   587   62    12    6.4
23.2    404   56   2.1   415   56   277   49   539   61    12   10.2
24.1    473   58   1.5   466   58   370   54   523   60     6    4.9
24.2    570   61   1.3   554   61   440   57   784   67    10    3.1
25.1    417   56   1.7   415   56   277   49   554   61    12    6.0
25.2    431   57   4.2   415   56   294   50  1319   76    26    6.4
26.1    478   58   1.8   494   59   293   50   932   70    20    7.6
26.2    457   58   0.7   440   57   392   55   554   61     6    2.2
27.1    507   59   1.6   523   60   392   55   587   62     7    6.3
27.2    527   60   1.3   494   59   415   56   622   63     7    4.4
28.1    404   56   2.2   415   56   294   50   494   59     9   10.5
28.2    437   57   1.4   415   56   349   53   554   61     8    6.2
29.1   1126   73   2.0  1245   75   847   68  1371   77     9    8.8
29.2    938   70   3.5   988   71   466   58  1480   78    20    4.1
30.1    416   56   2.9   415   56   254   47   587   62    15   10.4
30.2    394   55   2.4   349   53   330   52   554   61     9    5.7
31.1    419   56   0.9   415   56   370   54   498   59     5    3.4
31.2    413   56   2.0   415   56   349   53   494   59     6    8.6
--------------------------------------------------------------------
X       505   59   3.0   505   59   347   52   796   66    15    9.0
SD      171    4   2.3   183    5   138    7   318    6     9    4.0
--------------------------------------------------------------------
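The paired Hz and ST columns of Table 3-1 are related by a logarithmic conversion. The sketch below assumes a zero-semitone reference of 16.35 Hz; that reference is not stated in this section and is inferred here only because it reproduces the tabled pairs (for example, 440 Hz with 57 ST and 1319 Hz with 76 ST). The function names are illustrative.

```python
import math

REF_HZ = 16.35  # assumed zero-semitone reference, inferred from the tabled Hz/ST pairs


def hz_to_semitones(hz, ref_hz=REF_HZ):
    """Semitone level of a frequency above the reference (12 semitones per octave)."""
    return 12.0 * math.log2(hz / ref_hz)


def interval_in_semitones(low_hz, high_hz):
    """Size of the interval between two frequencies, in semitones."""
    return 12.0 * math.log2(high_hz / low_hz)


print(round(hz_to_semitones(440)))             # 57
print(round(interval_in_semitones(87, 1480)))  # 49
```

The second call corresponds to the overall span of detected frequencies (87 to 1480 Hz), which the text describes as extending over 49 ST.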


Table 3-2. Corroboration of FFI fundamental frequency (F0) values. Mean and modal F0 values were generated for checking from time-frequency-amplitude (TFA) spectrograms (mean) and Fast Fourier Transform (FFT) spectral analysis (mode). All values are in Hz.

----------------------------------
Cry       FFI    FFI    TFA    FFT
#           X   Mode      X   Mode
----------------------------------
1.1       557    494    570    560
1.2       495    494    500    500
2.1       488    554      *    460
2.2       435    440    400    430
3.1       479    440    450    480
3.2       471    466    450    460
4.1       541    523    550    560
4.2       312    370      *    280
5.1       381    370    360    380
5.2       445    392    430    430
6.1       404    392    400    420
6.2       465    440    450    460
7.1       420    415    400    410
7.2       457    466    460    460
8.1       487    466    490    460
8.2       520    523    525    520
9.1       423    440    430    430
9.2       455    440    500    500
10.1      461    440    450    460
10.2      548    554    550    560
11.1      399    415    400    430
12.1      445    415    440    440
12.2      537    494    525    520
13.1      497    494    420    430
13.2      513    523    450    520
14.1     1299   1319   1330   1290
14.2      586    554    600    590
15.1      572    554    560    570
15.2      594    587    560    570
16.1      956    880    900    940
16.2      595    831    650    590
17.1      391    392    375    440
17.2      365    349    340    360
18.1      359    466    330    320
18.2      425    392    375    410
19.1      389    370    410    380
19.2      443    466    450    460
20.1      555    523    570    590
20.2      574    554    550    590
21.1      484    466    480    480
21.2      527    523    510    520
22.1      389    392    400    410
22.2      459    466    440    440
23.1      414    415    430    410
23.2      404    415    400    410
24.1      473    466    500    580
24.2      570    554    580    590
25.1      417    415    430    410
25.2      431    415    400    410
26.1      478    494    480    480
26.2      457    440    450    460
27.1      507    523    500    500
27.2      527    494    500    500
28.1      404    415    400    410
28.2      437    415    450    450
29.1     1126   1245   1200   1190
29.2      938    988   1100   1120
30.1      416    415    450    460
30.2      394    349    450    460
31.1      419    415    400    400
31.2      413    415    400    410
----------------------------------
X         505    505    508    510
SD        171    183    187    183
----------------------------------
* --could not be determined


Figure 3-1. Scatterplot of mean fundamental frequency values (in kHz) for each cry sample.


Table 3-3. Summary table of frequency parameters. Mean and standard deviation values are provided for each parameter in Table 3-1. Values are presented also for the categories of 1) low frequency phonations, 2) high frequency phonations, 3) first phonations and 4) second phonations.

-----------------------------------------------------------------------
             X    X    SD  Mode Mode    Lo   Lo    Hi   Hi Range      %
            Hz   ST    ST    Hz   ST    Hz   ST    Hz   ST    ST    JIT
-----------------------------------------------------------------------
All Cries (N=61)
XA         505   59   3.0   505   59   347   52   796   66    15    9.0
SDA        171    4   2.3   183    5   138    7   318    6     9    4.0

All Low Frequency Cries (X F0 < 700 Hz) (N=57)
XA(L)      465   58   3.0   457   58   320   51   753   65    15    8.9
SDA(L)      66    2   2.3    60    2    85    5   281    6     9    4.3

All High Frequency Cries (X F0 > 700 Hz) (N=4)
XA(H)     1080   72   2.2  1053   72   731   65  1413   77    12    5.3
SDA(H)     169    3   0.9   219    4   181    5    81    1     5    2.4
-----------------------------------------------------------------------
First Cries Only (N=31)
X1         517   59   3.3   505   59   344   51   772   67    16   10.0
SD1        215    5   2.4   183    5   173    7   308    7     9    6.2

Low Frequency First Cries (X1 F0 < 700 Hz) (N=28)
X1(L)      452   57   3.0   457   58   293   49   771   66    16   10.0
SD1(L)      59    2   2.0    60    2    74    5   281    6     9    3.0

High Frequency First Cries (X1 F0 > 700 Hz) (N=3)
X1(H)     1127   73   1.8  1053   72   819   68  1390   77    10    6.0
SD1(H)     172    3   0.3   219    4    49    1    82    1     1    3.0
-----------------------------------------------------------------------
Second Cries Only (N=30)
X2         493   59   2.7   494   59   350   52   762   66    13    7.9
SD2        110    3   2.2   131    4    87    6   312    6     9    5.0

Low Frequency Second Cries (X2 F0 < 700 Hz) (N=29)
X2(L)      478   58   3.0   464   58   346   52   737   65    13    7.8
SD2(L)      71    3   2.0    66    3    86    6   286    5     9    4.8

High Frequency Second Cries (X2 F0 > 700 Hz) (N=1)
X2(H)      938   71   3.5   988   71   466   58  1480   78    20    4.1
-----------------------------------------------------------------------
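The low and high frequency categories of Table 3-3 can be recovered directly from the mean F0 column of Table 3-1. The following is a minimal sketch, assuming the 700 Hz cut-off shown in the table headings; the list and function names are illustrative, and the values are copied from Table 3-1 in cry order.

```python
from statistics import mean

# Mean F0 (Hz) for the 61 cry samples, copied from Table 3-1 in cry order.
MEAN_F0_HZ = [
    557, 495, 488, 435, 479, 471, 541, 312, 381, 445,
    404, 465, 420, 457, 487, 520, 423, 455, 461, 548,
    399, 445, 537, 497, 513, 1299, 586, 572, 594, 956,
    595, 391, 365, 359, 425, 389, 443, 555, 574, 484,
    527, 389, 459, 414, 404, 473, 570, 417, 431, 478,
    457, 507, 527, 404, 437, 1126, 938, 416, 394, 419,
    413,
]

THRESHOLD_HZ = 700  # cut-off between categories, as given in the Table 3-3 headings


def split_by_threshold(values, threshold=THRESHOLD_HZ):
    """Separate cries into low and high frequency categories by mean F0."""
    low = [v for v in values if v < threshold]
    high = [v for v in values if v > threshold]
    return low, high


low, high = split_by_threshold(MEAN_F0_HZ)
print(len(low), round(mean(low)))     # 57 465
print(len(high), round(mean(high)))   # 4 1080
```

The counts and category means agree with the XA(L) and XA(H) rows of Table 3-3.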


1080 Hz (SD=2.2 ST). In addition, it should be noted that the four high frequency cries were produced by three infants; three of the cries were first phonations, while the fourth cry was a second phonation following a high frequency first phonation by one of the three. Thus, the normal cry response is bi-modal. While the normal cries in this study were most often low frequency in nature, a few hyperphonational cries occurred as well, generally in the first phonation following the stimulus.

When first cries (P1) were compared to second cries (P2) there were no apparent differences in mean F0 (X(P1)=517 Hz, X(P2)=493 Hz); a paired t-test comparison (see Table 3-4) confirmed this relationship (t=0.76, P=0.27). Further, when the standard deviations (i.e., the variability) of first cries (SD(P1)=3.3 ST, SD(P2)=2.7 ST) were compared to those of second cries using a paired t-test, the difference, while stronger, still was not significant (t=1.20, P=0.12). Thus, the characteristics of fundamental frequency (and laryngeal dynamics) did not appear to change from the first cry to the second.

While the mean provides information concerning the mathematical center of frequency production, the mode describes the frequency produced most consistently. Thus, it was interesting to note the correlation between the values for the two parameters (r=0.95, P<0.001). The mean value for both parameters was the same (505 Hz), although the modal standard deviation was one semitone greater. In only three cases out of 61 was there more than a two semitone difference between them (4.1--3 ST, 18.1--6 ST, 16.2--7 ST). The strong correspondence between these two parameters suggests that the


Table 3-4. Results of paired difference t-tests between selected frequency parameters of phonation 1 and phonation 2.

----------------------------------------------------------------
Parameters           Mean                               One Tail
Tested               Diff      SD    S.E.    df      t        P=
                       ST      ST
----------------------------------------------------------------
X1 vs. X2 (ST)        0.5     3.6     0.7    29   0.76      .27
SD1 vs. SD2 (ST)      0.6     2.6     0.5    29   1.20      .12
Md1 vs. Md2 (ST)      0.4     3.7     0.7    29   0.54      .30
Lo1 vs. Lo2 (ST)      0.9     7.8     1.4    29   0.65      .26
Hi1 vs. Hi2 (ST)      0.6     6.5     1.2    29   0.51      .31
Rg1 vs. Rg2 (ST)      1.6    10.8     2.0    29   0.80      .28
JT1 vs. JT2 (%)       1.5     5.1     0.9    29   1.56      .06*
----------------------------------------------------------------
* Significant at the .05 level.


(easier to obtain) modal frequency may provide an index of fundamental frequency behavior that is as accurate as the mean.

The distribution of the observed F0 values (i.e., waveform periods detected during the analysis of each sample) within the group of cries was quite broad, extending over four octaves (49 ST) between 87 and 1480 Hz. Individual cries varied as little as 5 ST (not quite one-half octave) or as much as 48 ST (or four octaves). However, the means of the detected low and high frequency values were 347 Hz (SD=7 ST) and 796 Hz (SD=6 ST) respectively, a range of about 15 ST (SD=9 ST) or slightly more than one octave. Again, there was no difference between the ranges of first and second cries. Thus, while most infants cried at approximately the same mean frequency, the variability around that mean was considerable for each infant, indicating little volitional control over the crying act.

Jitter

The jitter metric is a measure of the percent of change in fundamental frequency from cycle to cycle; further, it provides an index of the variability of vocal fold vibration. As may be seen in Table 3-1, the values obtained for jitter during crying phonation were extremely wide ranging; they extended from 2.2% (#26.2) to 25.3% (#4.2), with a mean of 9.0% and a standard deviation of 4.0%. (The high and low jitter values were validated approximately using the H/N ratio, described in Chapter II; the obtained values using the second procedure were 3.7 and 29.2%.) A significant (t=1.56, P<0.10) difference was found between the jitter values for the first and second cry phonations. The wide range of values obtained for this


parameter is highly correlated with both the high F0 standard deviation values (r=0.66, P<0.001) and the wide range of detected frequencies (r=0.61, P<0.001) for F0. Indeed, all these relationships point to the extreme variability of vocal fold vibration during crying, probably due to intermittent aperiodicity during phonation as well as continual changes in respiratory and laryngeal dynamics.

Formant Frequencies

Formant frequencies are spectral resonances that are generated by the voice. As such, they provide information about the dynamics of the vocal tract during sound production. One of the goals of this study was to determine the formant values that are produced during crying, and to attempt to relate them to the characteristics of the infant vocal tract.

The resonance values obtained for each cry are presented in Table 3-5. Three to five peaks were evident on the long term spectral curves generated for each phonational sample (see next section and Table K-1 in Appendix K). If fewer than five peaks were apparent for a phonational sample, it was assumed (for purposes of calculation) that certain resonances might be obscured by aperiodic energy within the same bandwidth, or by having been "averaged out" over time. In addition, while most of the spectral peaks were confirmed by observation of concentrated energy regions on t-f-a spectrograms, some resonances were not apparent on the spectrograms. This was probably due to the fact that LTS displays energy summated over time, while the t-f-a procedure displays a moment-to-moment energy distribution.
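The jitter metric discussed earlier in this chapter is described only as the percent of change in fundamental frequency from cycle to cycle. One common way to express such a measure is sketched below, assuming a sequence of per-cycle frequency estimates is available; the exact formula used by the IASCP software is not given in this section, so this formulation (mean absolute cycle-to-cycle difference relative to the mean frequency) is an assumption, as is the function name.

```python
from statistics import mean


def percent_jitter(cycle_f0_hz):
    """Mean absolute cycle-to-cycle change in F0, as a percent of the mean F0."""
    if len(cycle_f0_hz) < 2:
        raise ValueError("at least two cycles are needed to measure jitter")
    # Absolute change between each cycle and the one that follows it.
    changes = [abs(b - a) for a, b in zip(cycle_f0_hz, cycle_f0_hz[1:])]
    return 100.0 * mean(changes) / mean(cycle_f0_hz)
```

Under this definition a perfectly periodic signal gives 0%, while a hypothetical signal alternating between 400 and 500 Hz on successive cycles gives roughly 22%, a value in the upper part of the range reported in Table 3-1.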


Table 3-5. Formant frequency values for each cry sample. Mean and standard deviation values are presented for first and second phonation, and for all cries. Formant values were derived from long term spectral analysis and t-f-a spectrographic analysis. Values for each cry are listed in ascending order (Hz). Rows with fewer than five values are those for which no formant was evident in one or more frequency ranges (marked * in the original table); the extracted text does not preserve which of F1-F5 was affected.

FIRST PHONATIONS

Cry #    Formant values (Hz)
------------------------------------
1.1      1060  1700  2140  3820  6080
2.1       660  1190  3820  7660
3.1       940  1340  1910  3820  7660
4.1      1060  2400  4820  6080
5.1      1190  1510  2400  3820  7660
6.1       830  1700  3820  5410
7.1       830  1190  1910  3400  4290
8.1       940  1340  1910  2400  4820
9.1       830  1340  2400  4290
10.1      940  1340  2700  3820  5410
11.1     1000  2400  3820  6080
12.1      830  1340  3030  5410
13.1      830  1340  2700  4820
14.1     1910  3820  5410
15.1     1190  1700  2140  3400  5720
16.1     1910  4290
17.1      940  1190  2400  3400  4820
18.1      660  1060  2400  3820  5410
19.1      780  1190  2700  3400  6080
20.1      830  1060  1700  3820  6080
21.1      940  1340  2400  3400  6080
22.1      830  1190  1910  4040  6080
23.1      830  1190  2140  3820  6830
24.1      940  1910  3820  6830
25.1      830  1190  2400  4290  6080
26.1      660   880  2260  3030  6030
27.1      940  1340  2400  3820  6080
28.1      830  1190  2140  3820  6080
29.1     1510  2400  4290  6830
30.1      940  1340  2140  3820  6080
31.1     1000  1510  2400  4290  6080
------------------------------------
           F1    F2    F3    F4    F5
X(Hz)     896  1316  2261  3730  5888
SD(Hz)    134   201   280   433   823

SECOND PHONATIONS

Cry #    Formant values (Hz)
------------------------------------
1.2      1000  1510  2140  3820  6830
2.2       940  1190  1700  3030  6080
3.2       940  1340  2140  2700  4820
4.2       940  1340  2400  4540  7660
5.2       880  1190  2140  3820
6.2       940  1340  2700  3820  6830
7.2       940  1340  2140  2700  4290
8.2      1060  1510  2400  4290  6830
9.2      1060  1510  2700  3820  6080
10.2     1120  1510  1910  2700  6080
12.2     1000  2140  2700  3400
13.2      940  1340  1910  3820  6830
14.2     1190  2140  3400  5410
15.2     1190  1700  2700  3820  6080
16.2      940  1190  1910  3820  6830
17.2     1060  1340  2400  3400  5410
18.2      830  1060  2140  3400  6080
19.2      940  1340  2700  3400  7660
20.2     1120  1910  3820  6080
21.2     1060  1510  1910  3400  7660
22.2      830  1190  1700  3400  5410
23.2      830  1190  2140  3820  6830
24.2     1190  1700  2400  3400  6830
25.2      830  1190  2400  4290  6080
26.2      940  1340  2400  3820  6080
27.2      940  2400  3820  7660
28.2      940  1510  1910  3820  6080
29.2     2140  3030  6080
30.2     1000  1510  2400  4290  6080
31.2      740  1190  2140  3400  6080
------------------------------------
           F1    F2    F3    F4    F5
X(Hz)     964  1376  2211  3535  6192
SD(Hz)    108   225   302   456   870

ALL PHONATIONS

           F1    F2    F3    F4    F5
X(Hz)     929  1347  2235  3631  6043
SD(Hz)    126   214   290   452   853

*--no formant was evident in this frequency range


Thus, a slight but continuous resonance might be evident using LTS but might not be apparent using spectrographic analysis.

Mean values for the five resonant frequencies (R1 to R5) were 930, 1347, 2235, 3631 and 6043 Hz respectively. These peaks were approximately confirmed by generation of the averaged long term spectral data for all phonations (see Table 3-7 and Figure 3-2 below). When all spectral data are averaged, five peaks emerge at approximately 940, 1190, 2140, 3400 and 6080 Hz. The differences between means of the respective peaks using these two methods are small enough to be within the standard deviation for each value.

Although the spectral peaks represent vocal tract resonances, it is difficult to associate the peaks with particular vocal tract characteristics. Individual cry phonations most often encompassed a range of vowel-like sounds (and vocal tract configurations). Since the frequencies associated with each sound were analyzed over time, the averaged resonant values tend to be related to a general vocal posture during crying rather than to specific vowel or vowel-like sounds.

Similarly, the complexity of the obtained spectral peaks and a lack of anatomical data about infants preclude definitive statements concerning the relationship between vocal tract size and resonant frequencies. For example, if the vocal tract transfer function (described in Chapter I) is applied to the first resonant peak (929 Hz), an average vocal tract length of 9.1 cm is described. This value is larger by 1.0-1.5 cm than estimates of the neonatal vocal tract posited by other investigators, but potentially still within the bounds of accuracy. However, it is possible also that the first peak simply


represents the first harmonic overtone, since it is twice the frequency of the fundamental. If R1 represents only a harmonic overtone, it might be expected that other peaks would be evident at approximate multiples of the fundamental, but there are none. If the second peak is assumed to be the first formant, a vocal tract length of 6.3 cm is obtained, slightly smaller than the 7.5-8.0 cm estimation, but also a reasonable estimate of vocal tract size in infants. As discussed earlier, it may be fruitless to apply such modeling procedures to data on infants since very little is known about the physical characteristics of the system being studied. Further, knowledge of the relationship between crying spectral peaks and vocal tract configuration may depend on further evidence relative to the anatomical characteristics of the infant sound production mechanism, as well as data about specific infant sounds and their frequency correlates.

In order to test for possible differences in vocal tract configuration between the first and second phonation, formant values were compared using a paired t-test. The results of that procedure are presented in Table 3-6. Only F1 showed a significant difference (P<0.01) due to sequence of phonation. The similarity between the resonances for the first and second cries suggests that the configuration of the vocal tract remains approximately the same during the progress of the cry episode.

Long Term Spectral Distribution

While the formant frequencies provide information about the resonant characteristics of the vocal tract, examination of the entire


Table 3-6. Results of paired difference t-tests between formant frequency values of phonation 1 and phonation 2.

----------------------------------------------------------------
Parameters           Mean                               One Tail
Tested               Diff      SD    S.E.    df      t        P=
                       Hz      Hz
----------------------------------------------------------------
F1                   70.4   146.4    28.7    25   2.45     0.01*
F2                   92.9   247.8    50.6    23   1.44     0.08
F3                   62.6   355.6    68.4    26   0.92     0.31
F4                  197.0   629.2   121.1    26   1.63     0.06
F5                  265.4  1087.2   213.2    25   1.25     0.11
----------------------------------------------------------------
*significant at the P<0.01 level.
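The vocal tract length estimates quoted in the preceding discussion (9.1 cm from the 929 Hz peak and 6.3 cm from the 1347 Hz peak) are consistent with a simple quarter-wavelength tube model. The sketch below assumes a uniform tube closed at one end and a speed of sound of 34,000 cm/s; both are assumptions made here for illustration, since the transfer function described in Chapter I is not reproduced in this section.

```python
SPEED_OF_SOUND_CM_S = 34_000.0  # assumed value, roughly 340 m/s


def tract_length_cm(first_resonance_hz, c=SPEED_OF_SOUND_CM_S):
    """Length of a uniform tube, closed at one end, with the given lowest resonance.

    For such a tube the lowest resonance is F1 = c / (4 * L), so L = c / (4 * F1).
    """
    return c / (4.0 * first_resonance_hz)


print(round(tract_length_cm(929), 1))    # 9.1
print(round(tract_length_cm(1347), 1))   # 6.3
```

Because the result scales inversely with the resonance chosen, treating the second peak rather than the first as F1 shortens the estimate by about one third, which is the comparison drawn in the text.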


spectrum can provide information about the relative energy levels between frequency intervals.

The analysis of spectral distribution (over time) resulted in -dB values (re: 1 volt) for 33 frequency intervals. Table K-1 (in Appendix K) contains a listing of the obtained values for each cry. Table 3-7 provides a summary of mean and standard deviation values by frequency interval for 1) all cry phonations, 2) first phonations and 3) second phonations. In turn, Figure 3-2 shows a mean power spectral curve for all cries and Figure 3-3 compares the spectral curves for the first and second phonation. Figures 3-4 and 3-5 display the spectral curves for the first and second phonations with a representation of the standard deviation for each interval on the ordinate.

As was noted in the last section, the averaged spectrum for all phonations exhibited peaks at 460, 940, 1190, 2140, 3400 and 6080 Hz, thus closely approximating the mean fundamental frequency value and the formant frequency values obtained in other procedures. The greatest amount of energy was centered between approximately 830 and 2700 Hz, with the maximal peak occurring at 1190 Hz. However, with the exception of the peak at 6080 Hz (i.e., the upper end of the spectrum), the difference between the maximal and minimal peak was only 5 dB. Also, the peak at 3400 Hz appears to represent the beginning of the terminus of the distribution, with the energy rolling off rapidly (10 dB/octave) at the higher frequencies.

Table 3-7 and Figures 3-3, 3-4 and 3-5 present the averaged data for the first and second phonations. The first phonation exhibits greater energy levels than the second phonation in the upper and lower


Table 3-7. Averaged long term spectral data for all cry phonations and for first and second phonations. Values represent mean total energy levels for each bin in -dB re: 1.00 volt.

Frequency      All Phon.       1st Phon.       2nd Phon.      Paired    Two Tail
Range (Hz)     X       SD      X       SD      X       SD     diff. t   P=
               -dB     -dB     -dB     -dB     -dB     -dB
---------------------------------------------------------------------------------
140-160        45.8    4.1     44.3    2.9     47.4    4.5    -3.48     0.002**
160-180        46.6    4.3     45.1    3.3     48.1    4.8    -3.56     0.001**
180-200        46.0    4.4     44.5    3.6     47.6    4.7    -3.32     0.002**
200-220        47.1    4.6     45.6    3.8     48.6    4.8    -3.14     0.004**
220-260        46.9    6.2     44.6    6.1     49.3    5.4    -3.82     0.001**
260-300        47.2    6.0     45.3    5.2     49.2    6.2    -2.76     0.010**
300-340        46.9    6.5     44.7    5.6     49.1    6.7    -3.38     0.002**
340-380        50.7    13.2    47.5    11.6    53.9    14.1   -2.43     0.022*
380-440        44.6    13.1    43.6    12.8    45.4    13.5   -0.94     0.353
440-480        41.8    9.8     42.5    9.2     41.0    10.6    0.81     0.423
480-560        45.2    10.2    46.3    9.2     44.0    11.2    1.25     0.222
560-620        47.6    10.6    48.6    8.5     46.6    12.4    0.77     0.447
620-700        50.2    8.9     50.0    8.6     50.3    9.4    -0.16     0.876
700-780        46.6    9.1     45.4    9.4     47.8    8.8    -1.21     0.236
780-880        40.5    9.4     39.8    10.5    41.2    8.2    -0.93     0.363
880-1000       39.6#   9.8     39.8#   9.8     39.5#   10.0    0.19     0.852
1000-1120      40.1    9.2     40.3    7.3     39.9    10.9    0.19     0.854
1120-1260      38.2#   9.4     38.2#   7.6     39.0#   11.1   -0.36     0.719
1260-1420      42.5    9.5     41.6    8.9     43.5    10.2   -1.39     0.174
1420-1600      43.4    11.2    41.2    8.2     45.6    13.4   -1.90     0.067
1600-1800      44.1    13.2    42.5    12.6    45.8    13.8   -1.50     0.145
1800-2020      41.2    13.9    40.5    14.7    42.0    13.1   -0.54     0.593
2020-2260      39.6#   12.6    38.8    12.2    40.6#   12.9   -0.68     0.499
2260-2540      40.1    13.8    37.5#   11.3    42.7    15.7   -2.22     0.034*
2540-2860      41.7    11.4    39.7    9.2     43.8    13.1   -1.82     0.080
2860-3200      44.2    12.0    42.9    9.5     45.6    14.2   -1.06     0.298
3200-3600      43.6#   13.0    41.8    10.4    45.3#   15.2   -1.28     0.209
3600-4040      45.0    16.6    41.7#   13.5    48.3    18.9   -2.00     0.055
4040-4540      48.4    17.6    44.6    14.5    52.3    19.8   -2.50     0.018*
4540-5100      50.8    17.9    46.9    14.4    54.7    20.4   -2.94     0.006**
5100-5720      51.5    15.8    48.6    11.5    54.5    19.1   -2.34     0.026*
5720-6440      51.3#   15.5    49.0    12.5    53.6#   18.1   -2.00     0.055
6440-7220      53.8    16.0    51.3    12.7    56.4    18.8   -2.00     0.055
---------------------------------------------------------------------------------
#--spectral peak
*--significant at the 0.05 level.
**--significant at the 0.01 level.
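As an illustration only, the peak locations quoted in the text can be approximated from the all-phonation means of Table 3-7 by looking for bins whose level (in -dB, so that a smaller value means more energy) is lower than that of both neighboring bins, and reporting the bin midpoint. This sketch is not the procedure used in this study; the function name and the 380 Hz lower cutoff (chosen here to skip the small ripples in the lowest bins) are assumptions made for the example.

```python
# Bin edges (Hz) and all-phonation mean levels (-dB re: 1 volt) from Table 3-7.
BIN_EDGES = [140, 160, 180, 200, 220, 260, 300, 340, 380, 440, 480, 560,
             620, 700, 780, 880, 1000, 1120, 1260, 1420, 1600, 1800, 2020,
             2260, 2540, 2860, 3200, 3600, 4040, 4540, 5100, 5720, 6440, 7220]
LEVELS = [45.8, 46.6, 46.0, 47.1, 46.9, 47.2, 46.9, 50.7, 44.6, 41.8, 45.2,
          47.6, 50.2, 46.6, 40.5, 39.6, 40.1, 38.2, 42.5, 43.4, 44.1, 41.2,
          39.6, 40.1, 41.7, 44.2, 43.6, 45.0, 48.4, 50.8, 51.5, 51.3, 53.8]


def spectral_peaks(edges, levels, min_freq=380):
    """Return midpoints (Hz) of bins that are local energy maxima.

    Levels are in -dB, so an energy peak is a local MINIMUM of the level.
    min_freq is a hypothetical cutoff, not part of the original analysis.
    """
    peaks = []
    for i in range(1, len(levels) - 1):
        low, high = edges[i], edges[i + 1]
        if low < min_freq:
            continue  # skip the low-frequency ripples (assumption)
        if levels[i] < levels[i - 1] and levels[i] < levels[i + 1]:
            peaks.append((low + high) // 2)
    return peaks


print(spectral_peaks(BIN_EDGES, LEVELS))
# -> [460, 940, 1190, 2140, 3400, 6080]
```

Under these assumptions the six midpoints coincide with the peak frequencies reported for the averaged spectrum.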

PAGE 84

Figure 3-2. Averaged long term spectral curve for all phonations.

PAGE 85

Figure 3-3. Comparison of the average long term spectral curves for the first (P-1) and second (P-2) phonations.

PAGE 86

Figure 3-4. Averaged long term spectral curve for first phonation.

Figure 3-5. Averaged long term spectral curve for second phonation.

PAGE 87

frequency regions, while energy levels are almost congruent in the central frequencies, where the peaks are clustered. However, the variability of the second phonation is consistently higher (P<0.01) than that of the first. As is evident from Figures 3-4 and 3-5, the lower frequency values are less variable than those in the central and upper frequencies, resulting in significant differences (P<0.05) between P1 and P2 in the spectral levels of the lower eight frequency intervals. In addition, the spectral values for three of the upper six frequency intervals are higher (P<0.05) for P1 than for P2.

The similarity between the shapes of the first and second phonational spectral curves indicates that the acoustic energy output during phonation does indeed follow a stylized format. However, an increase in energy level for P1 in the region below the fundamental frequency (with no concomitant change in energy level in the central frequencies) suggests a lower signal-to-noise ratio for the first than for the second phonation, probably due to an increase in fundamental frequency variability and/or aperiodicity. That is, as the variability in fundamental frequency rises, or as the level of low frequency phonational noise increases, so does the energy level in the lower frequencies. The higher upper frequency energy levels for P1 relative to P2 probably are not related to noise in the signal, but rather to better high frequency resonance from greater mouth opening (Fant, 1970). That is, since the first phonation is more proximal to the stimulus, the articulatory excursion in the response may be more extreme, with a resulting increase in high frequency energy.
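The two-tailed probabilities listed with the paired t values in Table 3-7 can be approximated from the t statistics alone. The sketch below integrates the Student t density numerically (Simpson's rule) using only the standard library. It assumes 29 degrees of freedom, i.e., the 30 infants who produced both phonations; that figure, like the function names, is an assumption made for illustration and is not stated with the table.

```python
import math


def t_pdf(x, df):
    """Probability density of Student's t distribution with df degrees of freedom."""
    coeff = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coeff * (1 + x * x / df) ** (-(df + 1) / 2)


def two_tailed_p(t, df, steps=2000):
    """Two-tailed probability for a t statistic (steps must be even)."""
    upper = abs(t)
    if upper == 0:
        return 1.0
    h = upper / steps
    total = t_pdf(0.0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    central_area = 2 * (h / 3) * total  # area between -|t| and +|t|
    return 1 - central_area


# t values from the 140-160 Hz, 340-380 Hz and 3600-4040 Hz rows of Table 3-7.
for t in (-3.48, -2.43, -2.00):
    print(t, round(two_tailed_p(t, df=29), 3))
```

With df=29 the computed probabilities fall close to the tabled values of 0.002, 0.022 and 0.055 for these three rows.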

PAGE 88

Intensity

The peak intensity of the cry provides an index of the infant's respiratory power and the strength of the response. Peak absolute intensity values (in dB SPL) for each cry phonation are presented in Table 3-8 (as measured 24 inches above the examination table, and directly above the head of the infant). As can be seen, cries ranged in intensity from 75 to 99 dB, with a mean and standard deviation for the first cry of 88.7 and 5.1 dB respectively, and 89.0 and 6.0 dB for the second phonation. There was no significant difference in the intensity of first cries vs. second cries (t=0.35, P=0.36). Furthermore, the likelihood of a louder cry on the first phonation (12/30, or 40%) was about the same as for the second (15/30, or 50%). In three cases (10%), the first and second cries were of approximately the same intensity.

Temporal Analyses

As discussed in Chapter II, the cry signal was divided into discrete segments that represented specific activities during the cry cycle. Each segment was timed for each infant; the results of that procedure are provided in Table 3-9.

Latency. Four different latency parameters were examined: 1) the elapsed time between the stimulus and a pre-phonatory vocalization (St-V1), 2) the elapsed time between the stimulus and the first phonation (St-P1), 3) the elapsed time between a pre-phonatory vocalization (if it occurred) and the following first phonation (V1-P1) and 4) the elapsed time between the stimulus and the first occurring vocalization (i.e., either pre-phonatory or phonatory) (St-V1/P1). In addition, two parameters were counts of the number of

PAGE 89

Table 3-8. Peak absolute intensity values (in dB SPL) for each cry phonation. The microphone was positioned 24" above the examination table and directly above the infant's head.

-------------------------------------
Subj.      P1 Peak      P2 Peak
  #        Int.         Int.
-------------------------------------
  1        97.0         96.0
  2        93.0         86.0
  3        89.0         89.5
  4        94.0         94.5
  5        92.0         83.5
  6        94.0         95.0
  7        87.0         77.5
  8        90.0         97.0
  9        95.0         94.0
 10        93.0         97.0
 11        88.0          --
 12        81.0         88.0
 13        78.0         84.0
 14        80.5         89.0
 15        85.0         89.0
 16        95.5         95.0
 17        90.0         89.0
 18        85.0         87.0
 19        87.0         91.0
 20        92.0         85.0
 21        87.5         89.5
 22        78.0         78.0
 23        90.0         85.0
 24        92.0         99.0
 25        82.5         92.5
 26        84.0         84.0
 27        86.0         75.0
 28        90.0         90.0
 29        94.0         91.0
 30        91.0         94.5
 31        87.0         85.0
-------------------------------------
  X        88.7         89.0
 SD         5.1          6.0
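The tallies quoted in the text (first phonation louder in 12 of 30 pairs, second louder in 15, three ties) can be checked directly against Table 3-8. The following is a minimal sketch; the data layout and function name are hypothetical, and subject 11, for whom no second phonation value is tabled, is excluded from the paired counts.

```python
from statistics import mean

# (subject, P1 peak, P2 peak) in dB SPL from Table 3-8.
# None marks the missing second phonation of subject 11.
PEAKS = [
    (1, 97.0, 96.0), (2, 93.0, 86.0), (3, 89.0, 89.5), (4, 94.0, 94.5),
    (5, 92.0, 83.5), (6, 94.0, 95.0), (7, 87.0, 77.5), (8, 90.0, 97.0),
    (9, 95.0, 94.0), (10, 93.0, 97.0), (11, 88.0, None), (12, 81.0, 88.0),
    (13, 78.0, 84.0), (14, 80.5, 89.0), (15, 85.0, 89.0), (16, 95.5, 95.0),
    (17, 90.0, 89.0), (18, 85.0, 87.0), (19, 87.0, 91.0), (20, 92.0, 85.0),
    (21, 87.5, 89.5), (22, 78.0, 78.0), (23, 90.0, 85.0), (24, 92.0, 99.0),
    (25, 82.5, 92.5), (26, 84.0, 84.0), (27, 86.0, 75.0), (28, 90.0, 90.0),
    (29, 94.0, 91.0), (30, 91.0, 94.5), (31, 87.0, 85.0),
]


def compare_phonations(rows):
    """Count pairs where P1 is louder, P2 is louder, or the two are equal."""
    paired = [(p1, p2) for _, p1, p2 in rows if p1 is not None and p2 is not None]
    first_louder = sum(p1 > p2 for p1, p2 in paired)
    second_louder = sum(p2 > p1 for p1, p2 in paired)
    ties = len(paired) - first_louder - second_louder
    return first_louder, second_louder, ties


print(compare_phonations(PEAKS))  # -> (12, 15, 3)
print(mean(p1 for _, p1, _ in PEAKS))
print(mean(p2 for _, _, p2 in PEAKS if p2 is not None))
```

The two printed means agree with the tabled values of 88.7 and 89.0 dB to within rounding.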

PAGE 90

Table 3-9. Timing data for individual subjects. A key for the variable notation is provided on the following page.

Cry   St-V1   # Vs    St-P1   V1-P1   St-V1/P1   P1 On-Off   P1-P2   # Vs    P2 On-Off
 #    Secs    St-P1   Secs    Secs    Secs       Secs        Secs    P1-P2   Secs
---------------------------------------------------------------------------------------
  1   2.50    4       5.03    2.53    2.50        1.90       0.77    0       2.78
  2   1.10    1       2.10    1.00    1.10        3.93       0.93    1       1.00
  3   1.70    2       2.78    1.08    1.70        3.12       0.30    1       1.33
  4    --     --      1.25     --     1.25        4.53       0.67    1       1.00
  5    --     --      1.17     --     1.17        5.93       1.02    5       0.33
  6   1.30    2       3.97    2.67    1.30        3.05       0.47    0       1.62
  7   1.93    2       2.70    0.77    1.93        3.38       0.37    0       0.90
  8   1.12    2       3.00    1.88    1.12        3.63       0.40    0       1.88
  9    --     --      4.68     --     4.68        5.67       0.28    1       0.72
 10    --     --      1.33     --     1.33       14.75       0.37    0       0.88
 11    --     --      2.27     --     2.27        3.17        --     --       --
 12   1.22    3       4.07    2.85    1.22        2.47       0.53    0       0.92
 13    --     --      0.50     --     0.50        3.77       0.40    0       1.42
 14   1.48    1       2.20    0.72    1.48        2.08       1.07    0       1.22
 15    --     --      1.57     --     1.57        0.73       1.37    1       0.50
 16    *      *        *       *       *          1.12       2.42    0       0.72
 17    --     --      1.40     --     1.40        8.43       0.38    0       1.87
 18    --     --      0.82     --     0.82        4.07       0.38    0       1.85
 19   1.13    2       1.55    0.42    1.13        9.90       0.37    0       2.18
 20   4.03    1       4.90    0.87    4.03        1.70       0.67    0       1.77
 21   1.73    1       2.47    0.74    1.73        2.63       0.43    1       1.57
 22    --     --      2.00     --     2.00        3.08       0.70    1       1.43
 23    --     --      0.90     --     0.90        7.83       0.63    0       0.82
 24   1.57    2       3.73    2.16    1.57        2.75       0.40    0       1.90
 25    --     --      1.77     --     1.77        4.27       1.50    1       1.67
 26    --     --      0.92     --     0.92        4.00       2.35    0       4.70
 27   1.47    1       2.10    0.63    1.47        3.67       0.60    0       0.95
 28   1.30    4       2.17    0.87    1.30        5.67       0.47    0       2.47
 29    --     --      1.32     --     1.32        7.48       0.35    0       2.05
 30   0.87    3       2.67    1.80    0.87        5.07       0.30    0       1.72
 31   1.77    8      10.10    8.33    1.77        0.85       5.80    6       0.65
---------------------------------------------------------------------------------------
  X   1.64    2.4     2.58    1.83    1.60        4.34       0.89    0.67    1.49
 SD   0.75    1.8     1.89    1.91    0.87        2.92       1.08    1.40    0.85

*--The recording of these portions of the cry from infant 16, from stimulus presentation to just before the onset of the first phonation, was lost.

PAGE 91

Notes--Table 3-9.

St-V1--Time in seconds (secs) from stimulus to first vocalization.
# Vs St-P1--# of vocalizations between stimulus and 1st phonation.
St-P1--Time (secs) from stimulus to 1st phonation.
V1-P1--Time difference between the onset of a prephonatory vocalization and the following phonation.
St-V1/P1--Time (secs) from stimulus to either the 1st vocalization or the 1st phonation.
P1 On-Off--Duration of 1st phonation.
P1-P2--Time (secs) between end of 1st phonation and beginning of 2nd phonation.
#Vs P1-P2--# of vocalizations between end of P1 and beginning of P2.
P2 On-Off--Duration of 2nd phonation.
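Following the key above, V1-P1 is simply the difference between the St-P1 and St-V1 latencies for those infants who produced a pre-phonatory vocalization. The sketch below derives that column from the two tabled latencies and summarizes it with and without infant 31, whose sequence contained eight pre-phonatory vocalizations. The dictionary layout and function name are hypothetical; the sample standard deviation (n-1) is assumed.

```python
from statistics import mean, stdev

# infant: (St-V1, St-P1) in seconds, for the 16 infants in Table 3-9
# who produced a pre-phonatory vocalization.
LATENCIES = {
    1: (2.50, 5.03), 2: (1.10, 2.10), 3: (1.70, 2.78), 6: (1.30, 3.97),
    7: (1.93, 2.70), 8: (1.12, 3.00), 12: (1.22, 4.07), 14: (1.48, 2.20),
    19: (1.13, 1.55), 20: (4.03, 4.90), 21: (1.73, 2.47), 24: (1.57, 3.73),
    27: (1.47, 2.10), 28: (1.30, 2.17), 30: (0.87, 2.67), 31: (1.77, 10.10),
}


def v1_to_p1(latencies, exclude=()):
    """Return V1-P1 (secs) for each infant not listed in `exclude`."""
    return [round(st_p1 - st_v1, 2)
            for infant, (st_v1, st_p1) in latencies.items()
            if infant not in exclude]


all_infants = v1_to_p1(LATENCIES)
without_31 = v1_to_p1(LATENCIES, exclude=(31,))

print(round(mean(all_infants), 2), round(stdev(all_infants), 2))  # -> 1.83 1.91
print(round(mean(without_31), 1), round(stdev(without_31), 2))    # -> 1.4 0.83
```

The first line reproduces the tabled mean and standard deviation for V1-P1; the second shows how strongly the single long latency of infant 31 influences both statistics.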

PAGE 92

non-phonatory bursts in the period before P1 (i.e., #Vs St-P1) and before P2 (#Vs P1-P2).

Sixteen of the thirty-one infants, or about half, began the cry event with one to eight pre-phonatory vocalizations (X=2.4, SD=1.8), beginning slightly less than a second to more than four seconds (X=1.64, SD=0.75) after application of the stimulus. These vocal bursts generally were expiratory in nature, and often had the quality of a "cough." As may be seen, if the infant began the cry with a pre-phonatory burst, the latency value to the onset of phonation (X=2.58, SD=1.89) became inflated, since it actually provides information about the latency to two different events. However, the amount of time between the pre-phonatory event (when one occurs) and the onset of phonation (V1-P1, X=1.83, SD=1.91) provides information relative to the ability of the infant to resolve the "coughing" sequence and (vocally) exhale. It is perhaps noteworthy that the value of this parameter is approximately the same as the latency to the pre-phonatory burst. (It should be mentioned also that this comparison is obscured by the fact that infant 31 had an unusual cry sequence characterized by many pre-phonatory bursts and extremely short phonational durations. If the data from this infant are excluded from the sample, then X(V1-P1)=1.4 and SD(V1-P1)=0.83. Other statistics change as well: X,SD(#Vs/St-P1)=1.07,1.28; X,SD(St-P1)=2.32,1.27; X,SD(#Vs/P1-P2)=0.48,0.99; X,SD(P1-P2)=0.72,0.56.)

The fourth latency variable (St-V1/P1, X=1.6, SD=0.87) is of interest also since it provides information on the response (i.e., sensory, neurological and motor processing) time from the stimulus to

PAGE 93

any first vocalization. In addition, this parameter is the most comparable to those examined in other studies involving latency (See Table 1-4). The mean values, which are derived from St-V1 values and St-P1 values (where there was no pre-phonatory event), are essentially the same as those from the St-V1 data only. To be specific, an unpaired t-test between the data for the two parameters indicated nonsignificance (t=0.85, P=0.42). That is, the time required for infants to process the stimulus and then respond was essentially stable, independent of the manner of the response (i.e., prephonatory vocalization or first phonation).

Duration. The durations of both the first and second phonations were measured from the onset to the cessation of vocal/expiratory activity. As can be seen in Table 3-9, the mean and standard deviation of the first phonation were 4.34 and 2.92 secs. respectively, while for the second phonation the values were 1.49 and 0.85 secs. When a paired t-test was carried out between these two parameters, the difference was found to be significant at the P<0.01 level (t=5.16, P=0.00038). The relative energy expenditure in the cries (as evidenced through duration) is, as with several other parameters, consistently emphasized proximal to the initiating stimulus. The first phonation represents the abrupt release of air after the inspirational response to the stimulus, while the second phonation duration generally begins to conform to the regular patterning of the basic cry that appears to follow.
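The paired comparison of phonation durations can be sketched from the P1 On-Off and P2 On-Off columns of Table 3-9. The example below assumes that infant 11, for whom no second phonation is tabled, is dropped and that the remaining 30 values pair off in table order; the list names and the function are hypothetical.

```python
import math
from statistics import mean, stdev

# Durations (secs) of the first and second phonations for the 30 infants
# with both phonations (infant 11 omitted), in the order of Table 3-9.
P1_DUR = [1.90, 3.93, 3.12, 4.53, 5.93, 3.05, 3.38, 3.63, 5.67, 14.75,
          2.47, 3.77, 2.08, 0.73, 1.12, 8.43, 4.07, 9.90, 1.70, 2.63,
          3.08, 7.83, 2.75, 4.27, 4.00, 3.67, 5.67, 7.48, 5.07, 0.85]
P2_DUR = [2.78, 1.00, 1.33, 1.00, 0.33, 1.62, 0.90, 1.88, 0.72, 0.88,
          0.92, 1.42, 1.22, 0.50, 0.72, 1.87, 1.85, 2.18, 1.77, 1.57,
          1.43, 0.82, 1.90, 1.67, 4.70, 0.95, 2.47, 2.05, 1.72, 0.65]


def paired_t(first, second):
    """Paired t statistic: mean difference over its standard error."""
    diffs = [a - b for a, b in zip(first, second)]
    return mean(diffs) / (stdev(diffs) / math.sqrt(len(diffs)))


print(round(paired_t(P1_DUR, P2_DUR), 2))
```

Under these assumptions the statistic falls near the reported t=5.16, with 29 degrees of freedom.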

PAGE 94

The Perceptual Analyses

As was discussed in Chapter II, certain characteristics of the cry tend not to be assessed in the types of acoustical and temporal analyses described above. Thus, in an attempt to more fully describe the cry act, two separate procedures were carried out. In the first, the investigator carried out a subjective analysis of each cry relative to the presence or absence of roughness, vocal fry, extreme variations in frequency and register shifts (or breaks). In the second, more formal procedure, a panel of trained listeners rated each cry for levels of roughness and strain.

Author's judgements of "cry quality" characteristics

The author's judgements relative to four characteristics of cry quality are presented in Table L-1 of Appendix L.

Roughness. A three level scale of roughness was applied. Out of 61 cries, five cries were rated as extremely rough, 29 were rated as moderately noisy and 27 cries had clear phonation. When the first and second cries were compared, first cries were more often extremely noisy (P1=3, P2=2) or moderately noisy (P1=16, P2=13), and less often had clear phonation (P1=12, P2=15).

Vocal fry. Over 40% (25) of the cries were found to exhibit vocal fry during some portion of the signal. First cries were more than twice as likely (P1=17, P2=8) to include fry as second cries. Moreover, all those infants who exhibited fry in the second phonation also had exhibited it in the first.

Variations in frequency. Over 20% (13) of the cries displayed an inordinate amount of frequency variability. Again, this feature was more likely to occur in the first phonation (P1=10) than in the

PAGE 95

second (P2=3). However, there did not appear to be a consistent relationship between P1 and P2 wavers. In all infants but one, wavers occurred in either the first or the second phonation, but not in both.

Register shifts. Six infants displayed register shifts at some point in the cry. Five of the infants shifted upward. The sixth infant, who hyperphonated for most of the cry, made the transition into the modal register, and then achieved loft once more. Four of the register shifts occurred in the first phonation. Of the two occurring in the second phonation, one was part of a hyperphonation (mentioned above) and the other was part of a modal phonation which followed a hyperphonational first cry.

Perceptions of roughness and strain

As was indicated previously, high levels of noise (i.e., "dysphonation") often were present during phonation. Although it has correlates in such variables as F0 standard deviation and jitter, phonational noise is a difficult parameter to measure. Indeed, its physical specification (relative to cry) rarely has been addressed in the literature (Golub (1980) is a notable exception). Consequently, a small panel of highly trained listeners (clinical and experimental specialists in voice) was asked to judge each cry on a scale of one to five for two parameters (roughness and strain). Roughness was defined as perceived aperiodicity, noise or harshness, while strain was defined as a noticeable increase in effort or difficulty due to constriction during vocal output. The averaged results are presented in Table 3-10. Individual responses are provided in Tables L-2 and L-3.

PAGE 96

Table 3-10. Perceptual responses of five trained listeners to each cry. Listeners were asked to rate the cries for strain and roughness along a 5 point scale (1: least - 5: most). Mean and standard deviations for all responses are provided, as well as differences in mean response between the first and second phonation of a cry event.

                   STRAIN                              ROUGHNESS
                  Phonation                            Phonation
Baby     1st         2nd       1st-2nd        1st         2nd       1st-2nd
 #      X    SD     X    SD     X1-X2        X    SD     X    SD     X1-X2
------------------------------------------------------------------------------
  1    4.2  0.4    3.2  0.4      1.0        2.6  1.5    2.6  0.9      0.0
  2    5.0  0.0    1.8  0.4      3.2        4.8  0.4    2.4  1.1      2.4
  3    2.8  0.8    1.6  0.5      1.2        4.0  1.0    2.0  0.7      2.0
  4    4.4  0.5    3.8  1.3      0.6        3.6  1.1    4.4  0.9     -0.8
  5    5.0  0.0    3.6  1.1      1.4        4.4  0.9    2.4  0.9      2.0
  6    3.2  1.1    1.0  0.0      2.2        3.0  1.0    1.0  0.0      2.0
  7    2.0  0.7    1.0  0.0      1.0        2.0  1.0    1.2  0.4      0.8
  8    4.6  0.5    2.8  1.6      1.8        3.6  0.5    3.8  0.4     -0.2
  9    4.8  0.4    2.8  0.8      2.0        4.4  0.9    3.4  1.1      1.0
 10    1.8  0.8    3.4  0.5     -1.6        1.8  0.4    4.0  1.0     -2.2
 11    4.4  0.5     --   --       --        3.8  0.8     --   --       --
 12    3.0  1.2    1.4  0.5      1.6        4.6  0.5    2.4  0.5      2.2
 13    2.8  0.8    1.6  0.9      1.2        3.0  0.7    2.0  0.0      1.0
 14    2.2  0.8    2.2  0.8      0.0        1.4  0.5    3.4  0.5     -2.0
 15    4.6  0.9    4.4  0.9      0.2        3.6  0.5    2.8  0.8      0.8
 16    2.6  1.1    2.0  0.7      0.6        3.8  0.4    3.8  0.4      0.0
 17    4.4  0.9    2.0  0.7      2.4        4.0  0.7    3.6  0.5      0.4
 18    4.6  0.5    3.0  1.0      1.6        4.8  0.4    3.6  0.9      1.2
 19    4.8  0.4    1.4  0.5      3.4        3.8  0.8    1.4  0.5      2.4
 20    4.4  0.5    3.2  1.3      1.2        3.4  1.5    3.6  1.1     -0.2
 21    2.8  0.4    1.8  0.4      1.0        2.8  0.8    2.6  0.9      0.2
 22    2.2  0.8    1.2  0.4      1.0        2.0  0.7    2.2  1.1     -0.2
 23    4.2  0.4    2.4  1.1      1.8        4.2  0.4    2.8  0.8      1.4
 24    3.0  0.7    1.2  0.4      1.8        2.6  0.9    1.8  0.8      0.8
 25    5.0  0.0    3.0  1.0      2.0        3.2  1.3    4.0  0.7     -0.8
 26    4.4  0.9    2.8  0.8      1.6        4.6  0.5    3.0  0.7      1.6
 27    3.2  0.4    1.2  0.4      2.0        2.8  0.8    1.4  0.5      1.4
 28    4.6  0.5    2.6  1.3      2.0        3.8  1.3    3.6  0.5      0.2
 29    2.6  1.1    2.8  1.3     -0.2        3.2  0.8    4.6  0.5     -1.4
 30    2.4  0.9    2.8  2.5     -0.4        2.8  0.4    4.2  0.4     -1.4
 31    1.6  0.9    2.6  1.5     -1.0        2.2  1.3    2.4  1.3     -0.2
------------------------------------------------------------------------------
  X                              1.2                                  0.5
 SD                              1.1                                  1.3

PAGE 97

A paired t-test was used to test the difference in the responses for the first and second phonations for each parameter. The results clearly indicate (P<0.01) that the first cry is perceived as rougher (t=2.5, P=0.009) and more strained (t=6, P=0.00001) than the second.

While the acoustic and temporal analyses quantify various aspects of the cry, those components that are not measured but are perceptible must be acknowledged as well. For example, a rating of phonational noise provides an index of the relative size of the noise component within the signal, and adds a dimension to data which might otherwise be indistinguishable during comparison procedures. The finding that first cries were perceived as both rougher and more strained than second cries impinges on the frequency information. Specifically, these perceptual data indicate that, in spite of similar F0 values, phonational patterns for first cries are different from those for second cries. That difference may be understood in terms of irregular vocal fold vibration, and greater tension in the laryngeal mechanism.

Summary

The data generated for this project are summarized below for each vector.

Fundamental frequency

The group mean (and mode) for cry fundamental frequency was found to be 505 Hz, with a 3 ST standard deviation. The range of individual means extended over two octaves from 312 to 1299 Hz. The range of detected frequencies within individual cries averaged 15 ST (SD=9 ST), or over one octave.

PAGE 98

The cries clustered into two frequency ranges: most (93%) were below 600 Hz (X=465, SD=3.0), while a few cries (7%) were above 900 Hz (X=1080, SD=2.2). Of the four hyperphonational cries, three occurred as first phonations and one occurred as a second phonation following a hyperphonational first phonation. No significant differences between first and second phonations were found for any of the frequency parameters. The paired comparison of standard deviation values showed the strongest difference, achieving a level of only P=0.12.

Jitter

Values for percent of cry jitter ranged from 2.2% to 25.3% (X=9.0%, SD=4.0%), and were probably related to the level of turbulence or aperiodicity during phonation. First and second phonations differed significantly in percent of jitter, although this relationship was not strong (P<0.10).

Formant Frequencies

Mean and standard deviations (respectively) obtained for cry formant frequencies were: F1=929, 126; F2=1347, 214; F3=2235, 290; F4=3631, 452; F5=6043, 853. First phonation values were found to be significantly higher than second phonation values only for F1 (P<0.01), indicating that the vocal tract configuration was approximately stable over time.

Long Term Spectral Distribution

Although individual spectral curves tended to vary substantially, a plot of averaged spectral data identified six energy peaks, centered approximately at 460, 940, 1190, 2140, 3400 and 6080 Hz. In general, the cries had a spectrum with most energy concentrated between 830 and

PAGE 99

2700 Hz. Energy level tended to be most variable and to drop off rapidly at higher frequencies (approximately 3600 Hz and above). The spectral curves of first and second phonation cries were nearly congruent in the central frequency region (between F1 and F3). However, P1 exhibited greater spectral energy levels in the lower and upper frequencies. The differences in low frequency energy probably were reflective of increased aperiodicity and lower signal-to-noise ratios for P1, while high frequency differences were likely due to changes in mouth opening.

Peak Intensity Level

Peak absolute intensity levels for each cry ranged from 77.5 to 99.0 dB (X=88.8, SD=5.5) 24 inches away from the microphone. No difference was found in this parameter due to phonation sequence.

Timing

Latency. After the application of the stimulus, about half of the infants responded with an average of 2.4 (SD=1.8) non-phonational vocalizations (a "cough" or inspiratory burst); the other half responded immediately with the first phonation. Independently of the manner of response, the latency time (X=1.6 seconds, SD=0.87 seconds) to onset was approximately the same. If the infant did respond first with a non-phonational vocalization, then approximately the same amount of time (X=1.83 seconds, SD=1.91) elapsed from that first vocalization to the first phonation, although this second segment of time tended to be more variable.

Duration. The first phonation within a cry episode averaged 4.3 seconds (SD=2.9), while the second phonation averaged 1.5 seconds (SD=0.85). The time lapse between the first and second phonations

PAGE 100

averaged 0.9 seconds (SD=1.1). Approximately one third of the infants produced inspiratory bursts during this interval.

Perceptual Judgements

Author's judgements of cry features. More than half the cries (34) were judged to have some level of roughness. Of that number, five were judged as extremely rough. Nearly half of the cries (41%) exhibited vocal fry during some portion of the cry. Approximately one fifth of the cries were rated as extremely variable in frequency. Finally, six of the 61 cries (9.8%) exhibited temporary shifts in register during phonation. Five of those were upward shifts (from modal to loft), while one was a downward shift (loft to modal). Each of these four features was more pronounced for the first phonation than for the second: roughness was noted slightly more often, fry occurred more than twice as often, frequency variation more than three times as often, and register shifts twice as often. Both second phonation register shifts followed high frequency first phonations.

Panel's judgements of roughness and strain. Five highly trained voice professionals rated first cries as consistently rougher and more strained than second cries within an episode.

PAGE 101

CHAPTER IV
DISCUSSION

The discussion to follow compares and contrasts the present results with previous findings about cry and with the hypotheses established for this research, and considers the larger issues revealed by the patterns within the data. The results pertinent to each hypothesis are discussed, and a model of crying in normal newborn infants is presented.

Basic Hypotheses

Fundamental Frequency Production

The F0 data obtained in this study are striking both in their similarity to and in their differences from previously obtained values. To be specific, the mean values for all cries approximated those reported in earlier studies, while the range of individual means was broader than those reported by other investigators. As may be seen from Table 4-1, the overall group mean and standard deviation (505 Hz and 4 ST, respectively) agree with values obtained for pain cries in all earlier studies. Moreover, the range of individual mean F0 values obtained for infants in this study (312-1299 Hz or 27 ST) is broader than, but encompasses, any reported in the literature for normal/low risk infants since Flatau and Gutzman (1906) (392-2637 Hz or 33 ST).

PAGE 102

Table 4-1. Comparison of fundamental frequency values obtained for normal pain cry in previous studies and in this study.

Study                            N      X (Hz)
------------------------------------------------
Ringel & Kluppel, 1964           10     413
Wasz-Hockert et al., 1968        60     530
Michelsson, 1971                 50     620
Tenold et al., 1974               9     518
Lester, 1976                     12     308
Lester, 1978                     12     466
Zeskind & Lester, 1978           12     468
Lester & Zeskind, 1978           40     606
Zeskind, 1981                    13     524
Zeskind & Lester, 1981           19     488

SD (ST) entries for the earlier studies are illegible. SD (Hz) entries for the earlier studies, in the order they appear: 30, 110, [illegible], 32, 83, 54, 302, 29, 100. Range (Hz) entries for the earlier studies, in the order they appear: 290-508, 410-650, [illegible], 440-800.

Klepper, 1985            N      X (Hz)    SD (ST)    SD (Hz)    Range (Hz)
---------------------------------------------------------------------------
All Phonations           61      505        4         171       312-1299
Low Phonations           57      465        2          66       312-595
Hi Phonations             4     1080        3         169       938-1299
1st Phonations           31      517        5         215       359-1299
2nd Phonations           30      493        3         110       312-936
---------------------------------------------------------------------------

PAGE 103

However, as noted in Chapter III, the frequency data for cries clustered into two types of phonation: 1) below 600 Hz (N=57, X=465 Hz, SD=2 ST) and 2) above 900 Hz (N=4, X=1080 Hz, SD=3 ST). The mean of the lower frequency (or modal) cries was 1.4 ST below the mean value for all cries, and had a standard deviation half that of all cries. These data suggest a narrower range of values (from about 400 to 530 Hz) for the means of most normal pain cries than a survey of the literature would indicate. Indeed, although the values for this modal group are congruent with the values reported from most studies, they are considerably lower than those presented in some (e.g., Michelsson, 1971 (620 Hz); Lester and Zeskind, 1981 (606 Hz)), and higher than those in others (e.g., Lester, 1976 (308 Hz)).

By contrast, the few hyperphonational (loft) cry responses which occurred were in a frequency range fully an octave higher than the lower modal cries. Of course, hyperphonation has been well documented in the cry literature (Truby and Lind, 1965; Michelsson et al., 1980; Golub, 1980; Lester and Zeskind, 1981), but has been associated most often with at-risk status or pathological involvement. Thus, the fact that three out of 31 (9.7%) apparently normal infants exhibited hyperphonational cries was somewhat unexpected. However, each of the infants who responded to the pain stimulus with high frequency cries produced only modal cries subsequently in the cry episode (i.e., after at least the second phonation). Furthermore, other cry episodes of the same children, produced during the course of the Brazelton administration, rarely contained hyperphonation. Thus, high frequency crying appeared to be a special response pattern that was elicited by the stimulus.
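The semitone (ST) comparisons used in this discussion follow from the standard relation of twelve semitones to the octave, so that the interval between two frequencies is 12 times the base-2 logarithm of their ratio. A minimal sketch, with a hypothetical function name, applied to the group means quoted above:

```python
import math


def semitones(f_low, f_high):
    """Interval in semitones between two frequencies (12 ST per octave)."""
    return 12 * math.log2(f_high / f_low)


# Modal mean (465 Hz) relative to the mean for all cries (505 Hz).
print(round(semitones(465, 505), 1))    # -> 1.4
# Hyperphonational mean (1080 Hz) relative to the modal mean (465 Hz).
print(round(semitones(465, 1080), 1))   # -> 14.6
# Range attributed to Flatau and Gutzman (392-2637 Hz).
print(round(semitones(392, 2637)))      # -> 33
```

The first value corresponds to the 1.4 ST difference between the modal and overall means, and the second shows that the hyperphonational mean lies somewhat more than one octave (12 ST) above the modal mean.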

PAGE 104

Five additional (i.e., modal) cries also exhibited high frequency crying in the form of brief, intermittent register shifts. These shifts were noted during the investigator's perceptual classification of cry features. The samples which contained the shifts were characterized by broad frequency ranges, "detected" high frequency values that were abnormally elevated, and normally low mean F0 and "detected" low F0 values. That is, the presence of high frequency energy for short periods of time did not substantially alter the cry's modal overall mean F0, but was reflected instead in range-related parameters.

In all, nine out of the 61 total cries (14.8%) exhibited some evidence of high frequency activity. In turn, seven of these nine cries were first phonations, indicating that 22.6% of all first phonations had a high fundamental frequency component. The other two cries displaying hyperphonation followed first cries that were totally hyperphonational. That is, the likelihood that a phonation would contain a high F0 component was greater if it: 1) was the first phonation in the cry episode, or 2) followed a high frequency first phonation.

In any case, the F0 data from this study suggest that:

1) normal cries are bi-modally distributed. That is, while the preponderance of the cries of normal healthy neonates do indeed fall into the lower (modal) range of values reported in previous studies, a small percentage are hyperphonational in nature (i.e., an octave above normal levels of phonation). Indeed, if the values for normal hyperphonations are included in the

PAGE 105

calculation of the group mean F0, then all mean values previously reported for normal infants are similar.

2) hyperphonation (either continuous or intermittent) in the normal pain cry appears to be a common phenomenon, occurring in more than one fifth of all first phonations, but only occurring one-third as often in second phonations.

Although the clustering of cry frequency data provides the appearance of a bi-modal distribution (with a 300 Hz gap between the two clusters), it seems that these two modes represent a continuum of laryngeal function. That is, hyperphonation can be facilitated from modal dynamics by a quantal increase in laryngeal muscular tension. Clearly, the stress induced in some babies by the pain stimulus can be sufficient to precipitate the generalized sympathetic input (and consequent constriction) described by Zeskind and Lester (1982) (and outlined in Chapter I). Most normal infants are obviously able to integrate and adapt to the stimulus, and they respond modally. But those who are sensitive to the stimulus, and who cannot repress an extreme whole body response, may be expected to manifest the additional tension associated with the production of hyperphonation.

Jitter

While adult jitter values have been generated from samples of sustained and clear phonations, controls of this type are impossible to implement in the study of infants. Consequently, the values obtained for jitter (2.2%-25.2%, X=9.0%) in this study were substantially higher than the available data on adults (approximately 0.5%-1.0%). However, these data may provide an index of the regularity of vocal fold vibration in infants.
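For readers unfamiliar with the measure, percent jitter expresses cycle-to-cycle variation in the glottal period relative to the average period. The sketch below uses one common definition (the mean absolute difference between consecutive periods, divided by the mean period); this definition is an assumption made for illustration and may differ in detail from the procedure used in this study. The function name and the sample periods are hypothetical.

```python
def percent_jitter(periods):
    """Mean absolute difference between consecutive periods, as a
    percentage of the mean period (one common definition of jitter)."""
    if len(periods) < 2:
        raise ValueError("at least two periods are required")
    diffs = [abs(b - a) for a, b in zip(periods, periods[1:])]
    mean_diff = sum(diffs) / len(diffs)
    mean_period = sum(periods) / len(periods)
    return 100 * mean_diff / mean_period


# Perfectly regular vibration (2.0 ms periods, i.e. 500 Hz) has no jitter.
print(percent_jitter([2.0, 2.0, 2.0, 2.0]))

# Hypothetical periods alternating between 2.0 and 2.2 ms.
print(round(percent_jitter([2.0, 2.2, 2.0, 2.2]), 1))
```

By this definition, the irregular cycle lengths associated with aperiodic vibration raise the percentage directly, which is consistent with the interpretation of jitter as an index of the regularity of vocal fold vibration.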

PAGE 106

The extremely high values found for jitter may be attributed to: 1) the noise which so often was present in the cry signal and 2) the lack of neuromuscular control (relative to adults) in the neonate. Aperiodicity or turbulence in vocal fold vibration manifests in random cycle-to-cycle changes in F0. This irregularity must result in dramatic increases in jitter values. Furthermore, even if noise were not a consideration in jitter, the variability between cycles would still be higher than is present in adults. Even the clearest, most stable cry phonations within this sample exhibited jitter levels of 2% or more (more than twice as high as adult levels), suggesting that infants exert less control over the laryngeal musculature than do adults. However, it must also be remembered that the crying infant is responding to an intrusion; comparison to adults phonating in a laboratory may be unsound.

In addition, while there was no clear relationship between jitter and perceptual judgements of harshness, those samples which were highest in jitter level (e.g., 4.2) also were rated as rough. Although the connection between jitter level and harsh quality is only tentative, it is in keeping with Wendahl's (1963) hypothesis that high levels of jitter are perceived as harshness. Certainly, turbulence or aperiodicity during phonation could account for both phenomena. In any case, the values generated for jitter did appear to be internally consistent as an indication of the stability of vocal fold dynamics.

Formant Frequencies

The formant frequencies obtained in this study are quite different from values reported by other investigators. Five resonant

PAGE 107

peaks generally were found for each phonation, with means for all phonations at 929, 1347, 2235, 3631 and 6043 Hz, in strong contrast to the two or three resonant frequencies reported in earlier studies (See Table 4-2). This five peak structure was corroborated by the mean long term spectral curve generated for all phonations. However, while many of these resonances could be confirmed using t-f-a spectrographic analysis (the principal tool of previous cry formant investigators), others often were difficult to detect with this second procedure.

It is possible to categorize the data from other studies in relation to the values reported here. This categorization assumes that other investigators relied exclusively upon a two or three formant scheme. However, the resolution available through long term spectral analysis has clearly distinguished the location of peaks that might be overlooked using the t-f-a spectrograph. In any case, when the data from this and earlier studies are schematized in the manner presented in Table 4-2, they are roughly consistent within that structure, thus providing some evidence from past research that the formant structure and values generated in this study are valid.

As has been noted previously, it would be difficult and of questionable purpose to ascribe "vowel qualities" to the formant values, since these peaks were derived from long term analysis of cries (i.e., over time), and the vowel-like quality of the vocalization often was changing. Instead, an attempt was made to relate the formant values to the resonant characteristics of the supralaryngeal vocal tract. Using the vocal tract transfer function, vocal tract lengths of 9.1 cm and 6.2 cm were calculated for resonant frequencies of 929 and 1347 Hz respectively. While both these values

Table 4-2. Comparison of mean formant frequency values obtained in this study with those from previous studies. Although other investigators presented obtained values in terms of a three formant scheme, their data are tabled here in relation to the five formant structure noted in the present investigation. Standard deviations are provided in parentheses.

Studies: Klepper, 1985; Ringel & Kluppel, 1964; Lieberman et al., 1971; Colton & Steinschneider, 1980; Gardosik et al.

Formant   Reported mean values in Hz (SD)
F1        929 (126); 1100
F2        1347 (214); 1450; 1623 (360); 1550
F3        2235 (290); 2450
F4        3631 (452); 3300; 3249 (536); 3109
F5        5043 (853); 5354 (825)

may be acceptable, there is as yet too little anatomical or perceptual information relative to the infant vocal tract and vowel-like sounds to accurately determine the relationships involved.

Long Term Spectra

While detailed data for long term spectra are new to this area, and therefore must be considered tentative, the comparisons between the spectral curves for P1 and P2 were particularly interesting. That is, while the spectral levels were essentially similar in the central portions of the spectrum (between F0 and F2), P1 exhibited significantly greater energy in the low frequencies (probably due to turbulence and a resulting lower signal to noise ratio) and in the upper frequencies (probably due to greater mouth distention, or a reduction in tissue elasticity due to muscular tension during crying). These curves provide strong evidence that the first phonation is, in general, more extreme and stressful than the second phonation. That is, the spectral characteristics for P1 reflect greater muscular stress than do those for P2.

Intensity

The intensity levels obtained in this study were considerably higher than those reported by other investigators. Specifically, the infants who were tested produced cries at a mean absolute peak intensity level of 88.9 dB SPL (SD=5.6) 24 inches from the microphone. This compares to Ringel and Kluppel (1964), who reported 84 dB twelve inches from the microphone, and Colton and Steinschneider (1980), who reported 74 dB nine inches from the microphone. The discrepancies in level are puzzling, since both earlier studies presumably incorporated measurements from calibrated recording systems. Colton and

Steinschneider mention that they filtered their sample recordings prior to measurement (because of background noise), which possibly lowered the overall energy level of their signals and accounted for their relatively low values. Thus, cry peak intensity levels were fully 5 dB higher 1 foot further away from the subjects than those levels reported from earlier studies. These differences in signal level are probably due to the manner of calibration.

Latency

In general, the latency data from previous studies have been remarkably consistent. Most studies of cry have employed t-f-a spectrographic analysis, in which time is represented along the x-axis, thus affording good temporal resolution. The data obtained in this study are more detailed in their specification of temporal segments than has been available in most reports, but agree approximately with findings of the past.

In general, the time lapse between the application of a pain stimulus and the onset of any vocal response is 1.6 seconds. This value is noteworthy since two types of vocal responses were evident. About half of the infants responded first to the poke with a pre-phonatory vocalization (e.g., an inspiratory burst, or a cough), which was followed by the first phonation. The other half responded directly with the first phonation. The time lapse tended to be consistent and independent of the form of the response, suggesting that the time required to internally process and respond to a pain stimulus poke is approximately stable for healthy infants. Without exception, the data from other studies of normal infants appear to

support this hypothesis. Mean latency values from all previous investigations are well within one standard deviation of the mean value found in this study.

However, although the amount of time from the stimulus application to the first vocalization (i.e., either pre-phonation or phonation) may be consistent, the amount of time from the stimulus to the first phonation may vary considerably. While half of the infants responded with a first phonation directly (as noted above), the other half responded with two intervals: 1) to the pre-phonatory vocalization (V1) and 2) from V1 to the first phonation (P1). The mean time lapse generally required for infants to complete this second interval (i.e., V1-P1) was approximately the same as the time required to vocally respond to the stimulus. However, the variability of the second interval was extremely broad, with a standard deviation greater than the mean. Thus, the interval from the stimulus application until the infant began exhaling (i.e., the expiratory phonation) was quite unpredictable; it ranged from 0.5 to 10.1 seconds.

The fifth hypothesis stated that latency was expected to be approximately 1.4-1.6 seconds, reflecting the infant's stimulus processing and response time. The latency value (1.6 seconds) obtained in this study was consistent with that value and with all other previous studies on cry latency. This suggests that the (sensory, neural, motor) processing time required for a healthy infant to respond to the stimulus is reasonably stable, independent of stimulus (different studies have utilized different stimuli) and independent of the manner of the vocalized response.

Duration

The extent of the first phonation was variable also (SD=2.9 secs.), and consistently longer (mean=4.3 secs.) than the cry durations reported by most investigators. As discussed in Chapter I (see Table 1-5), studies of the first phonations of normal infants have reported mean values of 1.2-6.5 seconds. The values generated in this study are consistent particularly with those reported by Wolff (1969) (4.1 secs.) and by Thaden and Koivisto (1980) (5.1 secs.). By contrast, the values for the second phonation are considerably and consistently shorter (mean=0.89 secs., SD=1.08) than those for the first.

Thus, the data reported here agree with Hypothesis 6. Cry durations were extremely variable and in the range expected. Duration of the first cry was longer than the second. However, cry duration was not correlated with cry latency. Finally, two categories (with about half of the infants in each group) of initiating cries were observed: those which were pre-phonational vocalizations and those which were phonational.

Other Issues

Although the primary purpose of this project was to establish pain cry values for a sample of normal healthy infants, it became clear that certain patterns were crucial to an understanding of the general problem of crying. Central to this understanding was the intrusive nature of the pain stimulus itself and the physiological dynamics of the response. Specifically, the progression from extreme to moderating response characteristics suggested that the infants were actively recovering

from, rather than adapting to, the stimulus. In general, the data indicated that: 1) first phonations tend to be rough, variable and strident, with laryngeal muscular tensions poised at the edge of transition into falsetto production; 2) second phonations tend to be more moderate and stable, less variable and exhibiting less general tension.

As has been discussed, several authors have suggested that the first phonation of the cry episode represents a stimulus-bound response, while later phonations represent a stimulus-independent or "basic" cry. One of the goals of this project was to examine that premise by attempting to discern differences between identical parameters for the first (P1) and second (P2) cry. It should be emphasized that P2 is assumed to be transitional (between the stimulus-bound and the stimulus-independent responses), rather than representative of the stimulus-independent response. That is, the characteristics of P2 are expected to be less extreme than those of P1, but perhaps more extreme than those of P3 (which were not analyzed in this study). Specifically, it was hypothesized that fundamental frequency values, jitter levels, formant frequencies, phonation durations, and peak intensity levels would be different. In addition, it was expected that listeners would rate the first cry as rougher and more strained than the second.

Phonation. Contrary to expectations, the F0 parameters showed no differences due to sequence. The possible exception was in the standard deviation of the two cries, in which the variability was greater for P1. However, the difference in that case did not reach significance (P=0.12), and can at best be called a weak trend.

Another F0-related parameter of variability, jitter, was significantly higher for P1 than P2 (although this relationship was weak, P<0.10). Together, these variables indicated greater variation around a central F0 for P1 than P2. These data were strengthened further by the comparison of spectral curves for P1 and P2, which showed significantly higher levels of spectral energy for P1 in the frequency range below the fundamental, again relating to turbulence and aperiodicity in the production of the laryngeal source tone.

The perceptual judgements of roughness and strain also were related to the quality of phonation. These data strongly indicated (P<0.01) that listeners found the first cry rougher and more strained than the second. Taken in concert with the findings for standard deviation, jitter and low frequency spectral energy, the perceptual data suggest that noise and phonational variability within the cry signal are reduced from the first to the second cry. In turn, this finding can be interpreted as an adaptation to the pain stimulus, or a movement from an extreme (and less controlled) response toward homeostasis.

Additional evidence of the underlying differences between phonations could be found in the frequency (of occurrence) of hyperphonation in first and second cries. High fundamental frequency phonation (900-1300 Hz) (loft) was found (either continuously or intermittently) in one-sixth (9) of all cries examined. Of these nine cries, seven were first phonation cries and two were second phonations that followed high frequency first phonations. That is, the likelihood that a phonation would contain a high fundamental frequency

component was greater if it: 1) was the first phonation in the cry sequence, or 2) followed a high frequency first phonation. Thus, hyperphonational activity during crying was most likely to occur closest to the pain stimulus, while the probability of occurrence decreased rapidly with time and sequence of phonation.

Formant frequencies. In the comparisons of formants between P1 and P2, only F1 showed a significant difference due to sequence; i.e., P1 was lower. However, as is clear in the comparison of spectral curves, F3 and F4 are (non-significantly) one-sixth of an octave higher for P1 than P2. Since the vocal tract length and approximate configuration must have remained constant from one cry to the next, it is probable that the first cry was performed with greater mouth opening and concomitant pharyngeal constriction than the second cry. As was discussed earlier, this same explanation (from Fant, 1960) was used by Colton and Steinschneider (1980) to explain higher F1 values than were predicted from expected vocal tract size. In the discussion at present, however, it is hypothesized that the articulatory gestures are more extreme for the first than for the second cries, which resulted in slightly higher formant frequencies as well as higher spectral levels (due to resonance) at those frequencies.

Peak intensity level and duration. Contrary to expectations (Hypothesis 4), the intensity levels of first phonations were approximately equal to those of second phonations. However, Golub (1980) has pointed out that if vocal energy is considered over time (rather than in terms of respiratory power), then the first cry is seen to require greater effort than the second. That is, energy expenditure may be considered as the product of intensity level and

duration of phonation. Given that assumption, the infant clearly expends more energy in the first cry than in the second.

To summarize and respond to the seventh hypothesis, i.e., that first phonations differed from second phonations, the data for P1 relative to P2 showed:

1) greater variation in F0 (as evidenced in jitter level, standard deviation and low frequency spectral energy);

2) greater levels of perceived noise (i.e., aperiodicity) in the cry;

3) higher incidence of hyperphonation (loft) activity;

4) slightly higher F3 and F4 values, with greater energy levels for those peaks, indicating possibly greater tension in the supraglottal tissues and/or distension of the supraglottal cavities (Fant, 1960; Zemlin, 1981);

5) equal peak intensity levels but longer phonational durations, resulting in greater levels of energy expenditure.

For each parameter discussed above, the value of the parameter has been more extreme closer to the stimulus and less extreme (or non-existent) as the stimulus becomes less proximal in time. While this configuration of data strongly suggests that more extreme responses are associated with the stimulus in question, it remains speculative to attribute these characteristics to an intrusive or stressful (e.g., pain) stimulus specifically, as opposed to, for example, a graduated but aversive stimulus like hunger. In any case, the cry can be shown to moderate somewhat from the first to the second phonation, with the incidence of certain cry characteristics (such as hyperphonation and roughness) tending to

subside. Unfortunately, although the beginning of the trend can be observed in the first two cries, "vocal homeostasis" (the "basic" cry) undoubtedly is not achieved until later in the cry episode.

The Physiology of the Pain Cry

As might be expected, the laryngeal conditions appropriate for production of the aperiodicity or high frequency phonations (loft or falsetto register) that are associated often with the pain stimulus are different from those for modal phonation.

Frequency. Although explanations of the laryngeal mechanisms underlying loft production are not free from debate, it is possible that the vocal folds have been tensed and lengthened as much as possible. In addition, the posterior portions of the vocal folds (in the region of the vocal processes) may be firmly approximated, not vibrating with the rest of the folds. That is, the intrinsic laryngeal muscles (e.g., the cricothyroid, interarytenoids, thyroarytenoid, the posterior cricoarytenoid) are extremely tense, the vocal folds are lengthened, thinned and stiff, and the length of the vibrating glottis is severely shortened (Brodnitz, 1959; Stevens, 1977; Zemlin, 1981). Thus, the dynamics of the loft register probably involve considerably more laryngeal tension than do those of modal phonation.

Noise. Zemlin (1981) notes that aperiodicity during phonation also seems to be due to "excessive tension of the vocal folds, coupled with undue medial compression," as well as a tendency to "initiate phonation with a glottal attack" (p. 223) (i.e., with the vocal folds already adducted). In cases such as this, the straining of

the folds combined with a forced expiration may cause them to vibrate asynchronously, or at least to cause turbulence during phonation. Alternatively, sudden and extreme changes in respiratory flow would be likely to interrupt patterns of periodicity during modal phonation, causing turbulence or "dysphonation." In studies of the perception of roughness using both synthesized and natural voice stimuli, a number of investigators (Wendahl, 1963, 1966a, 1966b; Coleman, 1969; Horii, 1981) have provided evidence that the magnitude of roughness judgements is related to the frequency differences between adjacent cycles (i.e., jitter).

Thus, it appears that aperiodicity in the cry, like high frequency phonation, is a phenomenon which arises from a state of extreme laryngeal tension. Given the generalized somatic tension exhibited by the infant in response to the pain stimulus (Truby and Lind, 1965), tension within the musculature responsible for high frequency or noisy vocal output is not surprising.

From a biological standpoint, it must be remembered that the larynx is primarily a respiratory valve which, among other functions, prevents foreign bodies from entering the airway. Sudden contraction of the laryngeal musculature may be the infant's best response to a threatening stimulus, since it results in closure and protection of the airway. However, the "modified Valsalva maneuver," in which the airway is partially or completely closed, results in elevated (sub-glottal) pressures. The interaction between these pressures and the laryngeal mechanism could explain many of the pre-phonational vocalizations (e.g., coughs and squeaks) which occur in the latency period between the stimulus and the onset of phonation.
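
The notion of jitter as the difference between adjacent cycles can be made concrete with a short sketch. It assumes one common definition (mean absolute difference between successive cycle periods, expressed as a percentage of the mean period); the exact measure used in this study may differ, and the function name and the period values below are hypothetical and chosen only for illustration.

```python
from statistics import mean


def percent_jitter(periods_ms):
    """Mean absolute difference between adjacent cycle periods,
    as a percentage of the mean period.

    This is one common jitter definition, assumed here for
    illustration; it is not necessarily the measure used in the study.
    """
    if len(periods_ms) < 2:
        raise ValueError("need at least two cycles")
    # Cycle-to-cycle differences between neighbouring periods.
    diffs = [abs(b - a) for a, b in zip(periods_ms, periods_ms[1:])]
    return 100.0 * mean(diffs) / mean(periods_ms)


# Hypothetical cycle periods (ms) near a 450 Hz cry fundamental.
steady = [2.22, 2.22, 2.22, 2.22]
perturbed = [2.20, 2.26, 2.18, 2.24, 2.20]

steady_jitter = percent_jitter(steady)        # perfectly periodic signal
perturbed_jitter = percent_jitter(perturbed)  # irregular vibration
```

Under this definition a perfectly periodic signal yields zero jitter, and any turbulence that perturbs individual cycle lengths raises the value, which matches the direction of the effect described above.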

A Model of Cry Production

The data obtained in this and earlier studies, and established relationships about the dynamics of human sound production, allow for the following model of cry production. Clearly, the intensity of the cry stimulus determines the intensity of the infant's response, and whether that infant can integrate and adapt to the stimulus in a controlled fashion. A high intensity stimulus (e.g., pain) precipitates global input from the sympathetic division of the autonomic nervous system, resulting in generalized muscular contractions. This action includes 1) constriction of the intrinsic and extrinsic laryngeal musculature (to effect protection of the airway), which results in increased tension in the vocal folds and 2) contraction of the respiratory musculature, which results in rapid inspiration, followed by forced and uninhibited exhalation. In some cases, this configuration of events may produce an inspiratory glottal stop, which sounds like a hiccup.

If a high enough level of laryngeal tension has been effected, the expiratory phonations may be very high frequency (loft) in nature. Alternatively, if the laryngeal tension is variable, or if respiratory flow is variable through a transitionally tense larynx, the resultant frequency production may be intermittently modal and loft. If the laryngeal tensions are inadequate to produce hyperphonational frequencies, the variabilities of respiratory flow across the modally adducted vocal folds may result in intermittent aperiodic or irregular vibrations, perceptible as noise. Finally, if the larynx has achieved the modal configuration, but the muscles of expiration have not been inhibited, the expiratory reserve volume will likely be exhaled. This action may

result in a low respiratory flow with "pulse" (fry) vibrations by the vocal folds.

Subsequent to the initial sympathetic innervation, discrete parasympathetic input to the vagus acts to inhibit the contraction of the cricothyroid muscle, probably the principal muscle of frequency change. The lessening of tension within this muscle ensures a modal vibratory cycle in subsequent phonations. Similarly, inhibition to the respiratory musculature ensures more regular tidal respiration during crying. Both of these general changes point toward the amelioration of the stimulus-bound cry and the establishment of the "basic" cry.

The Use of the Pain Stimulus: Implications

The arguments presented above attempt to relate the obtained cry data to the pain stimulus, and the physiological configurations that arose in response to that stimulus. The extreme vocal responses suggested for the first phonation are in contrast to the somewhat moderate values for the second phonation.

Stimulus-specific cry response characteristics have been noted previously in the literature, although little acoustical data has been accumulated until now to support this idea. Truby and Lind (1965) described three different cries (the basic cry (phonation), harsh or raucous crying (dysphonation) and high frequency crying (hyperphonation)) which corresponded (respectively) to increasing levels of stress and distress. Wolff (1969) also used the concept of a basic cry, noting that hunger, anger or pain elicited permutations of that general form. Lester and Zeskind (1982) proposed a classification schema which defines distress cries based upon the

"intensity quality" of the stimulus. For example, a pinprick is a "high intensity stimulus which elicits a maximal response" (i.e., a pain cry) while "more graduated" stimuli (such as hunger or wetness) "will probably produce cries that do not conform to the characteristics of the maximal pain cry response" (p. 153).

It is important to recall that the pain stimulus is incorporated into the Brazelton (and other tests) to help assess the infant's ability to cope with and resolve aversive intrusions into his/her environment. Correspondingly, the parameter values for the first and perhaps the second phonations of cry episodes may be extreme in their characteristics because they represent that pain response. That is, the first vocal response to the pain stimulus provides a "vocal window" to the infant's coping mechanism, but is not representative of "basic" crying and its underlying physiology. Instead, the pain response seems to be a permutation of the modal cry, combining the basic cry form with those elicited by the physiological changes imposed by the stimulus. Consequently, while an examination of the early stages of the cry event may provide information relevant to the infant's response to the stimulus, the later portions of the crying episode (e.g., the fourth, fifth and sixth phonations) may provide information about the infant's physiology unobscured by the response to the stimulus.

It should be considered also that the pain stimulus was chosen for this and other cry studies because it is easily controlled and applied. That is: 1) the stimulus itself is a known and repeatable quantity and 2) it can be applied for the desired response at will. However, while the experimenter can control the parameters of the

stimulus application, it is far more difficult to assess and control the internal state (e.g., emotional) of the infant, which is crucial in the processing of the stimulus, and the eventual vocal output. Indeed, the presence of these internal variables may serve to confound even the stimulus-related characteristics of the cry. Thus, this argument points also to the difficulties of making accurate assessments of underlying physiology from a stimulus-related cry response.

Cry and Pathology: A Perspective

For many years, the concept of cry-based diagnostic systems for infant pathology has received much attention, especially from the popular press. Many competent studies have been carried out which have shown positive correlations between particular cry features and different levels of at-riskedness. However, like the Lester and Zeskind studies (reviewed in Chapter I) which demonstrated elevated F0 levels for high-risk normal infants, the data from this study indicate that loft frequency production may be part of the normal spectrum of responses for an aversive stimulus. Thus, the facility for assuming a high frequency configuration of cry production may be a viable indicator of the infant's general sensitivity or vulnerability to stressful stimuli, while his ability to recover to the "modal" cry may provide an index of his ability to adapt and achieve homeostasis. Thus, it is reasonable to expect that compromised infants generally are more vulnerable to and less likely to recover easily from stressful stimuli. Consequently, a greater incidence of high frequency phonation should be expected from them.

These views are in concert with those stated by Lester and Zeskind (1982):

    in contrast to the cry of the damaged infant, in which the fundamental remains extremely high with little variability, infants at risk show many, and often dramatic shifts in pitch, or hyperphonation (Truby and Lind, 1965), especially during the pain cry. It is as if the pain stimulus sent the stressed infant into hyperphonation, probably because the nervous system lacks the inhibition (due to low vagal input) which prevents the muscles in the larynx from contracting ... What distinguishes the at-risk from the not at-risk infant is the frequency and duration of hyperphonation. Most low-risk infants rarely react to a pain stimulus with hyperphonation or quickly recover if they have that reaction ... In contrast to the phonated cry of the low-risk infant, the high-risk infant cry shows initial hyperphonation followed by other shifts in pitch, sometimes returning to hyperphonation.

It is questionable whether cry-based diagnostic assessment of compromise can become a reality at all. However, the statements above represent a genuine attempt to determine global configurations within the cry response that are representative of levels of health state. Undoubtedly, future research will be able to clarify different patterns that may provide additional delineations of status.

CONCLUSIONS

An acoustic, temporal and perceptual analysis was carried out on the first and second pain cries from 31 normal, healthy neonates between 26-30 days of age. In general, the results indicated the following.

1) While most pain cries exhibit a "modal" fundamental frequency between about 400 and 530 Hz, normal cries may also range an octave higher. The occurrence of high frequency cries is probably due to the generalized stress induced by the stimulus, and the individual infant's susceptibility to it (through

tightening of the laryngeal musculature). Thus, the presence of hyperphonation in the cry may be an indicator of the vulnerability of the infant and his overall stress level. A corollary to this is that the ease with which hyperphonation may be induced is indicative of the sensitivity of the infant.

2) The variability (as detected by jitter) of the laryngeal cry signal is substantially higher than that common to adult speech. The greater portion of this variability can be accounted for by turbulence in the cry signal (again due to laryngeal tension). However, the large variability evident even in relatively noise-free crying indicates that muscular control of the vocal mechanism is considerably less developed than it will be in the more mature individual. Furthermore, the tentative relationship between jitter and perceived roughness within the cry makes jitter another promising parameter for the monitoring of stress in the infant.

3) Although previous investigators have reported a three formant frequency structure, the spectral analysis utilized in this study indicated a five peak structure to the cry signal. However, the obtained values are not predicted by the vocal tract transfer function. It is difficult to accurately relate the formant values to vocal tract sources since there is inadequate anatomical information relative to the neonatal vocal tract and acoustic/perceptual data relative to infant sounds.

4) The vocal energy output of a crying neonate with intact respiratory function can be expected to approximate 89 ± 5.5 dB 24 inches from the microphone.

5) Healthy infants take approximately 1.6 seconds to initiate a vocal response to a pain stimulus, although the response can take the form of either a pre-phonatory burst or a phonation. This latency time reflects the sensory, neural and motor processing time necessary for the response.

6) The duration of the first phonation tends to be about 4 seconds but is quite variable. First phonations are virtually always of greater duration than second phonations.

7) First phonations tend to be more F0 variable, perceptually rougher and more strained, with greater vocal tract distension than second phonations within a cry episode. That is, first phonations appear to reflect characteristics of a response to the initiating stimulus, while subsequent phonations appear to moderate toward stimulus-independent cry. The use of acoustical analysis to derive information about underlying physiology is confounded by the choice of early phonations in the cry sequence and stimuli which produce extreme responses.

8) The cry analyses carried out in this project appear to be most sensitive to the infant's level of generalized muscular tension, and to the level of variability inherent in certain physiological processes (e.g., phonation, respiration). The data suggest that cry analysis does not as yet have adequate resolution for application as a diagnostic screening tool for pathology. However, cry analysis can non-invasively reveal vocal correlates of stress in babies, and should be applied to that task.
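
The third conclusion, that the obtained peaks are not predicted by the vocal tract transfer function, can be illustrated with a minimal sketch. It assumes the simplest idealization, a uniform tube closed at the glottis and open at the lips, whose resonances fall at odd multiples of c/4L. The speed of sound used here (34,000 cm/s) and the function names are assumptions for illustration; the constants actually used in this study are not stated.

```python
# Assumed speed of sound in cm/s; the value used in the study is not stated.
SPEED_OF_SOUND_CM_S = 34000.0


def tube_length_cm(first_resonance_hz, c=SPEED_OF_SOUND_CM_S):
    """Length of a uniform closed-open tube whose lowest
    (quarter-wavelength) resonance is first_resonance_hz."""
    return c / (4.0 * first_resonance_hz)


def tube_resonances_hz(length_cm, n=5, c=SPEED_OF_SOUND_CM_S):
    """First n resonances of the same idealized tube:
    F_k = (2k - 1) * c / (4 * L)."""
    return [(2 * k - 1) * c / (4.0 * length_cm) for k in range(1, n + 1)]


# Length implied by treating the lowest obtained peak (929 Hz)
# as the first resonance of the idealized tube.
length = tube_length_cm(929.0)

# Resonances that such a tube would predict, for comparison with
# the obtained peaks.
predicted = tube_resonances_hz(length)
```

With these assumptions the implied length is a little over 9 cm, and the predicted higher resonances fall at 3, 5, 7 and 9 times 929 Hz. The obtained second and third peaks (1347 and 2235 Hz) lie well below the second predicted resonance, which is one way of seeing why a single uniform tube does not account for the five peak structure.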

APPENDIX A
COMPARISONS OF THE CRIES OF NORMAL AND PATHOLOGICALLY INVOLVED INFANTS

The perceptual consensus that different-sounding cries are indicative of different health states has a counterpart in A/T studies, which have sought to demonstrate, based on the parameters described in Chapter I, that normal cries are different from cries of infants with specific pathologies. These studies have been carried out by several research groups with the intention of developing operational cry-based differential diagnostic systems, although to date no such system has been developed. Significant differences have been found between normal infants and those with: asphyxia (Wasz-Hockert et al., 1968; Michelsson, 1971; Michelsson et al., 1977), hyperbilirubinemia (Wasz-Hockert et al., 1971), brain damage (Karelitz and Fisichelli, 1962; Fisichelli and Karelitz, 1963; Lind et al., 1965; Wasz-Hockert et al., 1968; Michelsson et al., 1977), oro-pharyngeal anomalies (Lind et al., 1965; Massengill, 1969; Michelsson et al., 1975), low birth weight (Michelsson, 1971; Lester, 1976), genetic defects (Lind, 1965; Fisichelli et al., 1966; Vuorenkoski et al., 1966; Wasz-Hockert et al., 1968; Ostwald et al., 1970; Lind et al., 1970), S.I.D.S. (Stark and Nathanson, 1975; Colton and Steinschneider, 1981), relationship to S.I.D.S. infants (Colton and Steinschneider, 1980; Colton et al., in press), herpes encephalitis and congenital hypothyroidism (Michelsson and Sirvio, 1976), and drug-addicted mothers (Blenick, Tavolga and Antopol, 1971).

The sheer number of studies which have found differences between the cries of normal and abnormal infants is compelling. However, as Zeskind and Lester (1978) point out, the differences are not between normal and abnormal cries, but rather between the cries of infants with previously known physical characteristics. That is, the cry samples in these studies have generally been obtained and compared for differences on the basis of known physical characteristics of the infants. Cries have not been obtained randomly from infants and sorted on the basis of cry characteristics.

APPENDIX B
THE PHYSIOLOGY OF CRY AND ACOUSTIC/TEMPORAL CORRELATES

If it is assumed that each of the physiological systems described in Chapter I (i.e., nervous system, respiratory system, laryngeal system and supra-laryngeal tract) has a correlate in the sound produced by the infant, then it might be reasoned that:

1) The nervous system controls and interacts with all physiological systems involved in sound production. Consequently, normal functioning of the respiratory, laryngeal and supra-glottal systems presupposes normal neural function, and results in a normal cry signal. In the case of the normal cry, a "window" into neural function might be obtained by the parameter which is most unrelated to the mechanical aspects of vocalization, i.e., the timing or temporal sequencing characteristics of the cry event. Conversely, any abnormal cry characteristic potentially implicates neurological stress or dysfunction. Stark and Nathanson (1975) and Lester (1978) have pointed out that infants who are known to be neurologically impaired tend to display a constellation of abnormal cry characteristics that would be unaccounted for by a single "structural" defect. One example of this is cri-du-chat (cry-of-the-cat), a disorder characterized by an extremely high-pitched cry. While it might be tempting to attribute the cry to laryngeal pathology, it is now known that

PAGE 129

the cry is symptomatic of a chromosomal syndrome which affects CNS control and, finally, laryngeal function (Aronson, 1980).

2) The respiratory system provides a binary signal; that is, the individual is either inhaling or exhaling. While expiratory phonation is characterized by generally good periodicity and robust intensity, inspiratory phonation manifests as a relatively turbulent, low intensity signal. Furthermore, the amplitude of the cry signal and its duration reflect the intensity of the exhalation (i.e., the rate at which air is exhaled from the lungs) and the time taken to complete that event. Thus, acoustic intensity and durational cry characteristics reflect respiratory function.

3) The laryngeal system utilizes the airflow from the respiratory system to produce a source tone; the fundamental frequency (F0) of the cry represents the number of vibratory cycles occurring each second at the vocal folds. Further, by examining the cycle-to-cycle variation in F0, an insight into the inertial characteristics of the vocal folds might be obtained. However, it must be remembered that since laryngeal function is directly related to respiratory function, the output signals are related, i.e., a specific range of F0 may only be appropriate for a given absolute intensity.

4) The supra-glottal tract is formed by a series of chambers which impose characteristic resonances on the source tone; spectral characteristics of the output waveform (harmonic overtones and formant frequencies) reflect the signal shaping characteristics of those resonant chambers.
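The relationship described in item 3, between vocal fold cycles, fundamental frequency and cycle-to-cycle variation, can be illustrated with a short sketch. It is only an illustration and not the analysis procedure used in this study: the function names, the lists of cycle durations, and the choice of the mean absolute difference between consecutive periods as the variation measure are all assumptions introduced here.

```python
from statistics import mean


def fundamental_frequency(periods_s):
    """Estimate F0 in Hz as the reciprocal of the mean cycle duration (seconds)."""
    return 1.0 / mean(periods_s)


def cycle_to_cycle_variation(periods_s):
    """Mean absolute difference between consecutive cycle durations,
    expressed as a fraction of the mean cycle duration.

    This particular measure is an assumption made for illustration;
    the text does not specify how the variation would be quantified.
    """
    if len(periods_s) < 2:
        raise ValueError("need at least two cycles")
    diffs = [abs(b - a) for a, b in zip(periods_s, periods_s[1:])]
    return mean(diffs) / mean(periods_s)


# Hypothetical cycle durations: a perfectly steady 400 Hz source tone
steady = [0.0025] * 5
# Hypothetical durations with alternating longer and shorter cycles
perturbed = [0.0025, 0.0026, 0.0025, 0.0026, 0.0025]

print(fundamental_frequency(steady))
print(cycle_to_cycle_variation(steady))
print(cycle_to_cycle_variation(perturbed))
```

A steady tone yields zero variation, while the perturbed sequence yields a small positive fraction; larger values would indicate less regular vocal fold vibration.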

PAGE 130

While respiration and laryngeal function tend to be dynamic operations, the articulatory dynamics of the neonate are limited to gross tongue thrusts and opening/closing movements of the mouth. These actions affect the output signal, of course, but in a manner less interactive and more independent than the other systems. Consequently, it might be supposed that abnormal spectral characteristics in the cry signal are more likely due to structural anomalies within the supra-glottal tract than to abnormal functioning of the two systems which precede it.

PAGE 131

APPENDIX C
LETTER TO LOCAL PEDIATRICIANS

February 14, 1984

Dear Dr.

Some time ago we spoke to you about our study to establish acoustic/temporal cry data for normal newborns (26-30 days of age). At that time, you graciously agreed to encourage the participation of the parents with new infants who have come to your practice. This packet contains materials for you to examine and also to hand out to the parents. We have enclosed:

1) an abstract of the project (which has been submitted as a grant to the National Institutes of Health (NIH), and which is Brian Klepper's doctoral dissertation),
2) a sample of the informed consent form which each parent will be asked to sign at the experimental session, and
3) a supply of letters which may be given to parents as an explanation of the project.

We seek the participation only of children who have been determined to be medically healthy. Developmental normality will be ascertained during the experiment by administration of the Brazelton Neonatal Behavioral Assessment examination. Naturally, we will forward a copy of the examination results to your office for the child's medical records, and all parents may request a summary report and explanation of their infant's Brazelton developmental profile.

Thank you again for your help. Should you have specific questions about any part of the project, please feel free to call Susan Armstrong at home (373-1782), or Brian Klepper at home (377-3421) or at IASCP (392-2046).

Sincerely,

Susan Armstrong, Ph.D.
Assistant Research Scientist

Brian Klepper, M.A.
Research Assistant

PAGE 132

APPENDIX D
LETTER TO PARENTS OF PROSPECTIVE SUBJECTS

Dear Parent,

Here at the University of Florida, we are carrying out what we consider to be a very interesting study - we hope that you will find it interesting as well. In fact, we trust that you will find it worthwhile enough to assist us by letting your child participate in the project.

We are studying the characteristics of sounds produced by healthy newborn babies. An eventual goal of this research program is to develop a system that would use baby sounds as a diagnostic screening tool for certain types of medical problems. Our primary concern at this point, however, is to obtain information on normal infants. Consequently, there are three criteria that must be met before a baby can be included.

First, each must have been judged to be medically normal - as was established by your pediatrician (before he was kind enough to refer us to you). Next, each infant, at birth, must have had a normal weight for his/her length. Since this information must be included in our research, we must ask your permission to access your child's birth and medical records. Naturally, this (and any other) information obtained during this study will be held in the strictest confidence. Finally, we must be sure that your child is developmentally within normal limits. In order to verify that he/she is, we will perform the Brazelton Neonatal Assessment Scale. This test is used as a standard developmental tool throughout the country, and is designed to show how a baby compares to other healthy babies of his/her age. It provides a description of the infant's neurological reflexes, social interaction skills, muscle tone, and of his/her ability to be consoled. Ordinarily, this test is quite expensive (around $50), since it takes a highly trained - and certified - evaluator approximately one-half hour to administer (which would be all the time required of you) and another half-hour to score.
Naturally, there would be no charge for this examination and, in return for your cooperation, we will be most pleased to provide your pediatrician with the results. In addition, if you request it, we will provide you with a summary report and explanation of your child's developmental profile. Lastly, it should be noted that the Brazelton poses no risk of any kind to your child.

Since the objective of this study is to gain information on sound production characteristics of babies, we will tape record the entire Brazelton examination - extracting the pertinent sounds for use in our analysis. In order to provide the most favorable and comfortable environment for both the testing and the recording, we have specially prepared a sound treated booth for this purpose; it is located in Room 42 of the Arts and Sciences

PAGE 133

Building, University of Florida. Although this is where we prefer to carry out our procedures, we are willing to come to your home if that is your preference.

Once you volunteer, your child will be assigned a number, which will be used instead of his/her name. This procedure will assure that your child's privacy will not be violated. Only the investigators listed above and their research assistants will have access to the data. If you agree to participate in this study by signing the attached informed consent form, you will be free to withdraw at any time - and without prejudice. If you have any questions about any aspect of the study, please feel free to ask them.

May we count on your participation? Please give Dr. Armstrong a call between 8 AM and 9 PM at 373-1782 and she will be happy to set up an appointment with you. We would appreciate hearing from you. Thank you.

Sincerely,

Susan Armstrong, Ph.D.
Assistant Research Scientist

Brian Klepper, M.A.
Research Assistant

PAGE 134

APPENDIX E
THE PONDERAL INDEX

A helpful measure which provides information concerning the adequacy of fetal growth (i.e., the appropriateness of weight for length) is the Ponderal Index (PI), which is a ratio of the infant's birth weight to body length. Use of the PI circumvents two major problems:

1) Birthweight may be seen in terms of its appropriateness for length rather than achievement of an arbitrary value (e.g., 2500 grams). Use of the PI also facilitates differentiation between babies who are long but thin and infants who are short but heavy. When other complications are noted in the newborn, it has been shown that infants who are excessively overweight or underweight for birth length are at increased risk for early morbidity (Miller and Hassanein, 1971; Ounsted and Ounsted, 1973; Lubchenco, 1976).

2) Even the best estimates of calculated gestational age (CGA) may be as much as 2 weeks in error (Casaer and Akiyama, 1970). The PI serves as a better identifier of fetal growth and development than CGA. In addition, the PI stabilizes after 37 weeks gestation, so that evaluation of chronologic age in near- or full-term infants is unnecessary (Miller and Hassanein, 1971).

Infants scoring between the 3rd and 97th percentiles on birth weight (m = 5.8-10.1 lbs., f = 5.8-9.4 lbs.), length (m = 18.2-21.5 inches, f = 17.5-21.1 inches) and PI ratio (2.21-2.81) may be considered to exhibit normal fetal growth.
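As a worked illustration of the index, the sketch below computes a PI from weight in grams and length in centimeters. The cubic form used here (100 x weight / length cubed) is an assumption: the text above describes the PI only as a weight-to-length ratio, and this form was chosen because it reproduces the PI values listed for individual infants in Table I-1. The function names are hypothetical.

```python
def ponderal_index(weight_g, length_cm):
    """Ponderal Index computed as 100 * weight (g) / length (cm) cubed.

    Assumption: the cubic form is inferred from the values in Table I-1;
    the appendix text does not state the formula explicitly.
    """
    if weight_g <= 0 or length_cm <= 0:
        raise ValueError("weight and length must be positive")
    return 100.0 * weight_g / length_cm ** 3


def within_normal_pi(pi, low=2.21, high=2.81):
    """True when the PI lies inside the 2.21-2.81 range quoted in the text."""
    return low <= pi <= high


# Subject 1 in Table I-1: 3884.32 g and 54.61 cm
pi_subject_1 = ponderal_index(3884.32, 54.61)
print(round(pi_subject_1, 2))          # 2.39, matching the tabled value
print(within_normal_pi(pi_subject_1))  # True
```

Because length enters as a cube, a long, thin infant and a short, heavy infant of the same weight receive clearly different index values, which is the distinction the text attributes to the PI.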

PAGE 135

APPENDIX F
THE BRAZELTON NEONATAL BEHAVIORAL ASSESSMENT SCALE

While medical examinations and the ponderal index provide primary physical information about the health of the infant (including the identification of gross abnormalities), they are not generally sufficient to identify mild dysfunction of the central nervous system or the ability of the infant to respond to behavioral stimuli in any meaningful way (Als et al., 1977). Since the early 1960's, several assessment scales have been developed to meet this need. The Brazelton Neonatal Behavioral Assessment Scale (1973) has emerged as a standardized vehicle by which a neonate's behavioral competence may be assessed, and is the most comprehensive and widely used test of its kind to date. The following passage is taken from Als et al., 1977, as a statement of the theoretical framework upon which the Brazelton is constructed.

Five characteristics distinguish the BNBAS. It is, first, based on the conceptualizations of the neonate as complexly organized, capable of defending himself from negative stimuli, of controlling interfering motor and autonomic responses in order to attend to important external stimuli, and of eliciting stimulation from his environment necessary for his species specific motor, emotional, social and cognitive development. The examination attempts to assess these capacities by providing typical interactive

PAGE 136

situations. Second, the examination tries to set up optimal conditions for eliciting the newborn's best performance; this attempt places extra demands in training examiners to reliability. Third, with the infant's behavior as the guideline for the order of administration of items, the general procedure of the examination is to bring the baby up from a light sleep to an alert state, then into more active states, to crying and down to an alert inactive state again. A fourth consideration is that the examination utilizes the states of consciousness. Actions and reactions vary markedly, depending on the infant's state. The pattern of states through which the infant moves reflects his capacity to modulate his states of consciousness in order to be available for social and cognitive stimulation. In the brief periods of alertness a newborn has, the assessment of orientation and responsiveness to animate and inanimate, visual and auditory stimuli has high priority in the BNBAS. Finally, this type of assessment prevents labeling with a simple behavioral quotient, with its danger of influencing caretaker expectations. It is implicit in the conceptualization of this scale that individual score points will not tell whether an infant is normal or not. It is the pattern of behavior clusters, four in our most recent formulations, which becomes decisive in making a judgement of 'at risk' or normal.

The Brazelton is comprised of 46 items: 20 reflex items are

PAGE 137

used to assess the infant's neurological intactness, and 26 behavioral items are used to assess the newborn's interaction repertoire. The behavioral items may be reduced into four dimensions: Interactive Processes, Motoric Processes, Organizational Processes and Physiological State. Values obtained during the assessment of these four dimensions may not be distilled further into a single behavioral quotient, but must be used as part of a profile of the individual's behavioral status (Als et al., 1977). Each of the four behavioral dimensions may be defined as follows (from Als et al., 1977):

1) Interactive Processes reflects the newborn's ability "to attend to and process simple and complex environmental events."

2) Motoric Processes assesses the infant's ability "to maintain adequate tone, to control motor behavior, and to perform integrated motor activities."

3) Organizational Processes detects how well the neonate can maintain a "calm, alert state despite increased stimulation."

4) Physiological State is designed to determine "how much at the mercy of the physiological demands of his immaturity and the recovery from labor and delivery is the infant at this time? How well is the infant able to inhibit startles, tremors, and interfering movement as he becomes aroused or as he attends to social and inanimate stimuli? How much exhaustion enters into the picture of state modulation? How much is the baby at the mercy of the environment? How vulnerable is he to continued stimulation?"

PAGE 138

Curriculum Vitae

Susan Armstrong, Ph.D.
Date of Birth: June 30, 1945
Position: Assistant Research Scientist, Institute for Advanced Study of the Communication Processes

Education
State University of New York    BA     1969
State University of New York    MA     1974
University of Florida           PhD    1982
Early Childhood Educ. / Human Growth & Dev.

Professional History - Primary
Children's Developmental Services (CDS), Dept. of Pediatrics, Division of Neonatology, University of Florida
Director of Outlying Developmental Clinics and Research Associate, 1983
Child Development Evaluator, 1982-3
Member, Home Based Intervention Program, 1981-2
Member, Infant Development Project team, RNICU, 1979-81
Director, University Gardens Preschool, 1977-9

Professional Experience - Secondary
Co-Investigator, Florida Infant Health Development Project, 1981
Co-Investigator, GOLS Project (NIMH), 1981
Consultant, F.A.C.U.S. State Training Team, 1981
Coordinator, Kindergarten Workshop for Alachua County Schools, 1978
Early Childhood Developmental Curriculum Advisor to the Alachua County Kindergarten Project, 1977-8

Certification and Training Programs
Brazelton Neonatal Assessment Certificate of Reliability, Harvard University, 1982
Infant and Toddler Assessment Seminar, Harvard University, School of Medicine Continuing Education Program, 1981
Seminar of training for the Dubowitz Neonatal Assessment Instrument with Dr. Lillian Dubowitz, University of Florida, 1981
Developmental Assessment Practicum, Children's Developmental Clinics, Shands Teaching Hospital, Gainesville, Fla. Mastery of the Denver Developmental Inventory, the Bayley Scales of Infant Development, the Vineland Scales of Social Maturity, the Brigance Inventory of Development and the Peabody Picture Vocabulary Test was accomplished.

Graduate Experience
CDS, Division of Neonatology, Dept. of Pediatrics, College of Medicine
Research Assistant, 1981-2
Research Fellow, 1980-1
Dept. of Curriculum and Instruction
Research Assistant, 1976
Student-Faculty Liaison Assistant, 1975

PAGE 139

Publications
Armstrong, S. (1983) Infant-Parent Interaction Analysis and Developmental Outcome for a High Risk Premature Population: A Longitudinal Analysis, Dissertation Abstracts, 43:9.

Presentations
The Preschool and Developmental Programming, F.A.C.U.S. Annual Workshop, 1979.
The Development of Pre-Reading Skills, Conference on Reading Leadership in Fla. Public Schools, 1978.

PAGE 140

APPENDIX G
INFORMED CONSENT FORM

Project Title: Acoustic/Temporal Characteristics of Normal Neonatal Cry
Investigators: Brian Klepper, M.A.
               Susan Armstrong, Ph.D.
Address: ASB-63
Phone: 392-2046
Name of Child:
Child's Date of Birth:
Name of Parent:

We are asking that you allow your child to participate in a study aimed at developing information about the characteristics of sounds produced by normal healthy newborns. However, before we do so, it is important that you understand the exact nature of the project.

It will be necessary for us to review your infant's medical records for information about his/her weight and length at birth. Second, a certified evaluator will perform a Brazelton Neonatal Behavioral Assessment examination of your baby. This procedure will help us to categorize your baby according to developmental criteria of the study. This test is used as a standard developmental tool throughout the country, and is designed to show how a baby compares to other healthy babies of his/her age. The Brazelton's 26 behavioral items and 20 reflexes are useful in enabling the evaluator to describe the infant's neurological status, social interaction skills, muscle tone, and his/her ability to be consoled. Administration of the test involves application of a number of different stimuli which may

PAGE 141

result in a cry (such as pulling the baby to a sitting position, placement of the child in a prone position, heel stick (a gentle poke on the heel by the rubber end of a cuticle stick), eliciting a tonic neck reflex, a Moro reflex, and a defensive reaction to a cloth placed over the eyes). However, it should be emphasized that these stimuli will be in no way harmful, nor will they pose any risk to your child.

The entire test will be conducted in a sound treated room and will require approximately 30 minutes of your time; the results will be forwarded automatically to your pediatrician. Furthermore, the Brazelton evaluator will be available for a full explanation of the results. The examination will be tape recorded, and selected sounds from this recording will be acoustically analyzed at a later date.

If you agree to permit your child's participation, he/she will be assigned a number. This number will be used to identify his/her data. Thus, any information about him/her gathered during this study will be confidential. Only the investigators listed at the top of this form and their research assistants will have access to the data.

Unfortunately, we are not able to provide monetary compensation in exchange for participation in this study. However, subjects participating in this study will have the Brazelton examination carried out free of charge. Moreover, parents may receive the results of that examination upon request.

PAGE 142

Participation in this research is voluntary, and you are free to withdraw your child at any time. If you have any questions about any aspect of the study, please feel free to ask them.

*******************

I have read and I understand the procedure described above. I agree to participate in the research and I have received a copy of this description.

Signature of Parent                    Date

Signature of Witness                   Date

PAGE 143

APPENDIX H
TAPE RECORDER CALIBRATION FOR INTENSITY MEASUREMENTS

In order to determine the correspondence between amplitude of the cry signal and sound pressure level (dB SPL), it was necessary to calibrate the recording system for the measurement of intensity. The Quest Electronics microphone, attached to the Quest Electronics Model 215 Sound Level Meter, was calibrated using a piston-phone, with an output leading to the input of the tape recorder. Calibration at additional frequencies (500, 800, 1000, 1600, 2500 and 4000 Hz) was carried out through the use of a Phonic Ear Hearing Aid Test Chamber, and the microphone was found to have a flat frequency response. The piston-phone generates a 1000 Hz sine wave at 110 dB. Since the output of the sound level meter provided a voltage of 1.2 volts at full scale (any attenuation level), the 1000 Hz signal was used to calibrate the tape recorder input at 0 VU. RMS voltages both in and out of the tape recorder were measured and matched. Thus, the intensity level of the recorded signal was calculated based on the known reference voltages and SPLs.
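The final step, calculating an intensity level from a known reference voltage and SPL, can be sketched as follows. This is a minimal illustration under stated assumptions, not the procedure actually used: it assumes a linear recording chain, takes the 110 dB calibration tone from the text as the reference level, and ignores the attenuation setting of the sound level meter. The function and parameter names are hypothetical, and the voltages in the example are invented for illustration.

```python
import math

CAL_TONE_DB_SPL = 110.0  # level of the 1000 Hz piston-phone tone, from the text


def spl_from_rms(v_rms, v_cal_rms, cal_db_spl=CAL_TONE_DB_SPL):
    """Estimate dB SPL from an RMS voltage.

    v_cal_rms is the RMS voltage produced by the calibration tone on the
    same channel (a hypothetical parameter; the text gives no such figure
    for the tape recorder output).
    """
    if v_rms <= 0 or v_cal_rms <= 0:
        raise ValueError("voltages must be positive")
    return cal_db_spl + 20.0 * math.log10(v_rms / v_cal_rms)


# Hypothetical reading: one tenth of the calibration voltage
level = spl_from_rms(0.12, 1.2)
print(round(level, 1))  # 90.0, i.e. 20 dB below the calibration tone
```

Each tenfold drop in RMS voltage corresponds to a 20 dB drop in level, so matching voltages into and out of the recorder keeps the same reference valid for the playback signal.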

PAGE 144

APPENDIX I
SOMATIC DATA FOR INDIVIDUAL INFANTS

Table I-1. Somatic data for individual infants.
--------------------------------------------------
  S #    age     sex      wt        lgth     PI*
        (days)           (gr.)      (cm)
--------------------------------------------------
   1     30      m      3884.32    54.61    2.39
   2     28      m      4139.49    53.34    2.73
   3     28      m      3203.86    52.07    2.27
   4     29      m      2806.92    49.53    2.31
   5     28      f      3657.50    52.07    2.59
   6     28      m      3515.73    52.07    2.49
   7     28      m      4876.66    55.88    2.79
   8     29      f      3175.50    50.80    2.42
   9     28      f      3033.74    49.53    2.50
  10     28      m      4423.02    55.88    2.53
  11     28      f      3941.03    55.88    2.26
  12     29      m      3203.86    49.53    2.64
  13     27      m      3912.67    55.88    2.24
  14     30      m      3827.61    54.61    2.35
  15     29      f      3685.85    54.61    2.26
  16     28      m      3090.44    52.07    2.18
  17     28      f      4281.26    54.61    2.63
  18     28      m      3969.38    53.34    2.61
  19     28      f      3402.32    53.34    2.24
  20     29      m      3714.20    53.34    2.45
  21     30      m      4026.08    52.07    2.65
  22     29      m      3685.85    52.07    2.61
  23     30      m      3260.56    50.80    2.49
  24     29      f      4026.08    53.34    2.65
  25     28      f      3232.21    50.80    2.47
  26     27      m      3799.26    54.61    2.33
  27     28      m      3317.27    53.34    2.19
  28     27      m      3799.26    52.07    2.69
  29     30      m      3629.15    53.34    2.39
  30     26      m      2920.33    48.26    2.60
  31     29      m      3544.09    52.07    2.51
--------------------------------------------------
   X     28.4    22m    3644.69    52.77    2.47
  SD      1.0     9f     465.58     2.07    0.17
--------------------------------------------------
*PI = Ponderal Index

PAGE 145

APPENDIX J
BNBAS (DEVELOPMENTAL) DATA FOR INDIVIDUAL INFANTS

Table J-1. Brazelton scores for individual subjects. Three lines of values are presented for each subject. The first line represents raw scores on each of the Brazelton items (a key to the items is provided at the end of the table), the second line represents adjusted scores, and the third line provides means for each cluster of scores. Subject, cluster and item identification are provided at the end of the table.
------------------------------------------------------------
CLUSTER #  1 2 3 4 5 6 7
ITEMS      1 2 3 4  1 2 3 4 5 6  1 2 3 4 5  1 2 3 4  1 2 3 4  1 2 3  1
------------------------------------------------------------
1 ,. 1 : 1 8 5 6 5 3 8 6 6 7 8 5 7 6 5 4 8 5 7 7 6 3 5 1 #( 1 8 5 6 5 8 8 6 6 3 2 5 4 6 5 3 3 5 7 7 4 8 5 1 1 6 66 4. 40 4. 5 6 .7 5 5 66 1 2 6 7 1 8 5 5 4 7 8 6 7 7 8 5 7 6 5 4 9 5 6 6 1 3 5 1 6 7 1 8 5 5 4 7 8 6 3 3 2 5 4 6 5 3 9 5 6 6 9 8 5 1 4 66 6 16 3. 8 4 .5 6 5 7. 3 1 3 8 1 4 5 5 6 4 6 8 7 8 6 6 6 6 4 8 7 6 5 1 2 5 0 8 1 -;, 4 5 5 6 4 6 2 3 2 4 5 6 4 3 8 7 6 5 9 9 5 0 4.5 4.8 3. 4 4.5 6. 5 7. 66 0 4 1 2 4 5 4 5 4 6 7 6 9 5 7 6 7 5 8 6 6 5 1 3 4 1 i': 1 2 4 5 4 5 4 6 3 6 1 5 4 6 3 2 8 6 6 5 9 8 6 1 1 4.00 4.20 3.75 6 2 5 7 .66 1 5 1 9 5 8 5 9 9 6 6 6 6 5 6 3 4 3 8 9 9 2 6 3 4 1 1 9 5 8 5 9 9 6 6 6 6 5 5 3 ,.. 4 8 9 9 2 4 8 6 1 0 1 7 50 5. 8 0 4 5 7 .0 6 00 1 6 1 8 6 6 7 7 6 7 5 7 7 6 5 6 6 3 8 4 5 1 1 2 2 2 1 7 6 6 7 7 6 3 5 3 3 4 4 6 4 4 8 4 5 1 9 9 7 2 1 6 6 7 3. 6 0 4 5 4 5 8 33 2 7 i, -/; # ,: 1 8 5 8 9 9 9 6 7 6 7 5 8 6 6 4 5 1 6 3 9 y 0 .) ,, 1 8 5 8 9 9 9 6 3 6 3 5 2 6 4 3 5 1 3 6 7 1 1 0 1 8 0 4 6 3 75 3 7 5 3.0 0
------------------------------------------------------------

PAGE 146

135 Table J-1 -continued. --------------------------------------------------------------8 'it: -le 1 8 7 7 8 7 8 5 6 7 6 5 7 6 6 4 8 6 7 1 6 2 6 2 1 8 7 7 8 7 8 5 6 7 4 5 4 6 4 3 8 6 7 1 4 9 4 2 1 7.5 5.4 4.25 5.5 5.67 2 9 7 8 1 6 4 5 5 6 5 7 5 7 8 5 7 6 7 3 8 4 5 9 1 3 6 0 7 e -; 1 6 4 5 5 6 5 3 5 3 2 5 4 6 3 4 8 4 5 9 9 8 4 0 5.33 5.16 3.6 4.25 6 .5 7.0 0 10 9 3 8 1 7 7 6 7 7 7 6 5 6 7 5 7 6 5 4 8 5 7 1 1 5 5 2 9 8 8 1 7 7 6 7 7 7 6 5 6 3 5 4 6 5 3 3 5 7 1 9 6 5 2 6.5 6.83 5 4.5 5.25 6.67 1 11 8 5 1 7 6 6 5 6 5 6 6 7 8 6 7 6 6 4 9 5 7 7 1 5 5 0 8 5 1 7 6 6 5 6 5 6 6 3 2 4 4 6 4 3 9 5 7 7 9 6 5 0 3.5 5.83 4.20 4.25 7.0 6.66 0 12 1 7 6 6 6 7 7 5 5 8 8 6 5 6 7 3 9 8 7 9 1 5 6 1 -; :: "i': 1 7 6 6 6 7 7 5 5 2 2 4 4 6 3 4 9 8 7 9 9 6 4 1 1 6.5 3.6 4.25 8.25 6.33 1 13 1 8 6 7 5 8 8 5 6 7 8 4 5 6 5 2 8 8 8 1 1 4 6 0 1 8 6 7 5 8 8 5 6 3 2 4 4 6 5 5 8 8 8 1 9 7 4 0 1 7.0 4.0 5.0 6.25 6.66 0 14 -;,': _., 1 8 5 7 7 8 8 7 5 7 8 5 5 4 5 3 9 8 8 9 1 2 5 0 1 8 5 7 7 8 8 3 5 3 2 5 4 4 5 4 9 8 8 9 9 9 5 0 1 7.16 3.6 4.25 8.5 7.66 0 1 5 i: 8 8 7 7 7 8 7 5 6 7 7 6 6 4 4 3 3 8 9 9 1 3 4 0 8 8 7 7 7 8 7 5 6 3 3 4 5 4 6 4 8 8 9 9 9 8 6 0 8 7.33 4.2 4.75 8.5 7.6 0 16 1 8 7 7 7 8 8 5 6 9 7 5 7 5 4 2 8 7 9 9 1 5 4 2 ,'< 1 8 7 7 7 8 8 5 6 1 3 5 4 5 6 5 8 7 9 9 9 6 6 2 1 7.5 4.0 5.0 8.25 7.0 1 17 i, -..: 1 8 8 8 8 9 9 7 7 3 6 6 5 6 6 4 9 9 6 1 1 2 5 3 .. 1 8 8 8 8 9 9 3 3 3 6 4 4 6 4 3 9 9 6 1 9 9 5 3 1 8.33 3.8 4.25 6.25 7.66 3 18 -/: 1 9 5 9 5 9 9 5 6 7 8 4 6 4 4 2 8 9 5 2 1 2 4 1 1 9 5 9 5 9 9 5 6 3 2 4 5 4 6 5 8 9 5 2 9 9 6 1 1 7.67 4.00 5 00 6.00 8.00 1 19 ')~ 1 8 6 7 5 8 7 5 6 8 7 5 7 6 6 4 9 5 5 8 1 3 6 0 1 8 6 7 5 8 7 5 6 2 3 5 4 6 4 3 9 5 5 8 9 8 4 0 1 6.83 4.2 4.25 6.75 7.0 0 --------------------------------------------------------------

PAGE 147

136 Table J-1 -continued. --------------------------------------------------------------20 ;'< 1 8 6 6 4 6 6 5 5 9 7 6 8 6 5 5 9 4 5 8 1 2 7 1 -<*l 8 6 6 4 6 6 5 5 1 3 4 2 6 5 2 9 4 5 8 9 9 3 1 1 6.00 3.6 3.75 6.5 7.00 1 21 7 1 5 6 4 5 5 4 5 5 7 6 6 8 6 6 5 8 5 6 7 1 2 7 0 7 1 5 6 4 5 5 4 5 5 3 6 4 2 6 4 2 8 5 6 7 9 9 3 0 4 4.83 4.6 3.5 6.5 7 0 22 1 5 5 5 6 6 4 5 5 7 7 5 7 6 5 4 8 8 7 5 1 2 4 0 1 5 5 5 6 6 4 5 5 3 3 5 4 6 5 3 8 8 7 5 9 9 6 0 1 5.17 4.20 4.50 7 8.0 0 23 1 6 6 8 6 8 9 5 5 6 8 6 5 6 5 3 9 7 8 7 1 2 5 0 1 6 6 8 6 8 9 5 5 6 2 4 4 6 5 4 9 7 8 7 9 9 5 0 1 7.16 4.4 4.75 7.75 7. 28 0 24 1 8 6 6 5 8 5 4 4 6 7 5 8 6 5 4 8 2 4 5 1 2 7 2 1 8 6 6 5 8 5 4 4 6 3 5 2 6 5 3 8 2 4 5 9 9 3 2 1 6.33 4.4 4 4.75 7 2 25 4 1 6 7 7 7 8 8 6 6 6 7 5 7 6 4 3 9 6 5 5 1 4 5 2 4 1 6 7 7 7 8 8 6 6 6 3 5 4 5 4 4 9 6 5 5 9 7 5 2 2.5 7.16 5.2 4.25 6.25 7.0 1 26 ,'< 1 8 7 8 8 8 8 6 4 5 6 6 5 6 8 3 7 6 6 1 1 5 6 0 1 8 7 8 8 8 8 6 4 5 6 4 4 6 2 4 7 6 6 1 9 6 4 0 1 7.83 5.0 4.0 5.0 6.33 0 27 1 5 8 7 7 8 8 6 4 4 3 5 7 6 4 3 9 3 5 1 1 2 5 1 ;~ 1 5 8 7 7 8 8 6 4 4 3 5 5 6 6 4 9 3 5 1 9 9 5 1 1 7.16 5.4 5.2 4.5 7.66 1 28 x 1 8 7 6 5 8 8 6 5 7 5 5 6 6 3 3 7 6 7 1 1 2 7 0 1 8 7 6 5 8 8 6 5 3 5 5 5 6 6 4 7 6 7 1 9 9 7 0 1 7.0 4.8 5.25 5.25 6.33 0 29 1 6 6 7 9 8 8 7 5 7 7 6 7 6 6 3 7 4 3 1 9 9 3 1 -I< 1 6 6 7 9 8 8 3 5 3 3 4 4 6 4 4 7 4 3 1 9 9 3 1 1 7.3 3.6 4.5 3.75 7.0 1 30 9 6 1 5 6 6 5 7 4 5 7 6 8 8 7 7 9 3 5 9 4 2 7 3 9 6 1 5 6 6 5 3 4 5 3 4 2 2 3 1 9 3 5 9 6 9 3 3 5.2 5.4 3.8 2.0 6.0 6.5 3 31 1 7 4 5 5 5 4 5 4 7 6 6 8 6 6 4 8 4 5 1 4 3 5 2 1 7 4 5 5 5 4 5 4 3 6 4 2 6 4 3 8 4 5 1 6 8 5 2 1 5.0 4.4 4.25 4.50 6.33 2 ------------------------------------------------------------------------------------------------------------------------------

PAGE 148

Notes--Table J-1. Subject, Cluster and Item Identification

SUBJECT ID
 1. JS     9. LW    17. KO    25. RR
 2. AT    10. JS    18. RW    26. KB
 3. GK    11. KS    19. LP    27. NH
 4. JO    12. PO    20. CI    28. JW
 5. KE    13. CN    21. RS    29. JM
 6. DE    14. AR    22. ES    30. DK
 7. KB    15. MD    23. TP    31. CT
 8. WR    16. JN    24. JG

ITEMS

CLUSTER 1: HABITUATION
1. Light
2. Rattle
3. Bell
4. Pinprick

CLUSTER 2: ORIENTATION
1. Inanimate Visual
2. Inanimate Auditory
3. Animate Visual
4. Animate Auditory
5. Visual & Auditory
6. Alertness

CLUSTER 3: MOTOR PERFORMANCE
1. Tonus
2. Maturity
3. Pull-to-Sit
4. Defense
5. Activity

CLUSTER 4: RANGE OF STATE
1. Peak of Excitement
2. Rapidity of Buildup
3. Irritability
4. Lability of State

CLUSTER 5: REGULATION OF STATE
1. Cuddliness
2. Consolability
3. Self Quieting
4. Hand to Mouth

CLUSTER 6: AUTONOMIC REGULATION
1. Tremors
2. Startles
3. Skin

CLUSTER 7: REFLEXES
1. Reflexes Abnormal

PAGE 149

APPENDIX K
LONG TERM POWER SPECTRAL VALUES FROM INDIVIDUAL CRIES

Table K-1. Long term spectral data for each of the cry samples. Values represent mean total energy levels for each bin in -dB re: 1.00 volt. The legend at the end of the table provides the frequency ranges encompassed by each bin.
---------------------------------------------------------------------------
                               Cry Sample #
Bin #    1.1    1.2    2.1    2.2    3.1    3.2    4.1    4.2    5.1    5.2
---------------------------------------------------------------------------
    5  41.61  45.39  43.30  49.32  42.70  52.24  44.64  47.14  43.30  50.66
    6  41.12  44.64  42.14  49.32  43.30  52.24  44.64  48.16  43.94  50.66
    7  38.99  41.61  40.65  46.22  42.14  54.18  44.64  46.22  42.14  48.16
    8  42.14  43.94  41.61  49.32  39.37  54.18  45.39  46.22  42.14  44.64
    9  42.46  45.81  44.76  51.45  38.11  55.43  45.02  47.65  42.42  43.39
   10  40.88  46.22  48.74  54.12  38.11  55.43  41.88  46.68  41.12  49.99
   11  40.65  46.22  47.65  53.21  35.65  54.18  45.89  48.23  39.82  52.24
   12  43.04  46.22  48.74  74.83  34.91  60.20  53.67  50.20  38.90  47.90
   13  43.73  39.66  43.39  45.29  36.19  42.52  57.55  52.67  38.62  28.91
   14  35.34  32.07  44.64  43.57  28.69  27.11  53.21  52.24  45.89  37.84
   15  28.66  28.01  48.48  55.43  31.51  47.47  44.04  51.12  50.20  60.20
   16  29.44  39.18  47.48  52.89  32.54  42.46  44.97  48.55  46.53  62.21
   17  37.36  43.97  45.49  52.33  30.96  53.23  51.06  49.32  51.12  56.06
   18  42.86  45.85  50.49  53.30  33.40  57.82  51.85  49.66  44.53  45.09
   19  39.43  35.37  44.52  45.45  30.69  37.56  50.48  49.19  42.27  29.49
   20  31.12  25.67  37.36  43.14  25.38  35.22  49.57  48.62  40.00  40.94
   21  29.34  26.72  35.04  48.90  27.67  47.80  40.42  49.80  37.74  41.49
   22  39.02  41.77  33.78  34.76  29.21  36.07  46.94  45.52  35.62  33.54
   23  42.85  44.65  42.58  46.07  28.35  30.12  46.47  43.44  37.66  41.42
   24  30.50  36.97  47.42  53.20  30.34  52.23  41.35  41.73  36.03  46.19
   25  24.04  39.30  53.78  50.09  34.49  51.33  34.58  36.40  37.90  37.67
   26  24.77  32.45  56.55  57.73  28.16  51.19  34.46  28.42  32.75  37.19
   27  21.87  28.96  57.23  60.59  33.37  38.32  29.31  27.41  28.85  32.40
   28  26.90  30.70  56.01  75.59  39.24  39.59  27.06  24.30  24.39  37.59
   29  27.58  34.08  51.53  65.65  34.31  36.53  30.73  32.50  29.19  46.33
   30  30.69  33.45  51.04  55.88  32.87  37.70  39.45  35.00  34.88  45.56
   31  27.36  27.80  49.32  65.10  31.25  37.58  37.46  39.41  34.13  40.68
   32  23.59  25.59  40.95  86.53  28.21  40.46  34.68  34.20  32.97  36.43
   33  33.46  34.20  43.12  92.44  31.44  42.28  28.38  29.87  34.71  53.40
   34  36.52  37.51  48.46  94.32  36.40  38.09  27.82  31.69  40.05  50.11
   35  36.59  37.62  53.19  69.96  42.22  47.65  33.68  39.84  45.94  50.47
   36  33.33  40.14  57.79  64.28  57.58  59.23  35.57  42.48  49.29  51.32
   37  38.80  40.01  60.29  81.35  61.74  65.19  37.14  45.42  51.54  52.08
---------------------------------------------------------------------------

PAGE 150

Table K-1 - continued.
---------------------------------------------------------------------------
                               Cry Sample #
Bin #    6.1    6.2    7.1    7.2    8.1    8.2    9.1    9.2   10.1   10.2
---------------------------------------------------------------------------
    5  45.39  52.24  47.14  48.16  45.39  47.14  38.62  50.66  45.39  43.40
    6  45.39  54.18  47.14  50.66  46.22  47.14  39.37  50.66  47.14  41.61
    7  43.94  54.18  47.14  50.66  46.22  46.22  38.26  52.24  47.14  39.37
    8  44.64  54.18  48.16  52.24  47.14  47.14  38.62  54.18  48.16  38.62
    9  40.78  55.54  48.74  52.24  48.74  46.78  41.13  54.18  19.32  39.58
   10  33.56  54.18  49.32  54.18  49.32  44.64  41.88  54.18  50.66  38.27
   11  34.64  56.68  44.46  54.18  49.99  45.02  42.14  54.18  49.29  40.01
   12  38.27  76.59  38.44  76.59  75.62  48.90  43.57  57.19  45.89  42.72
   13  36.91  44.29  28.61  50.74  69.12  52.22  33.02  19.83  36.29  43.96
   14  43.00  28.62  38.62  32.49  42.59  47.19  36.17  43.00  32.10  41.75
   15  47.77  31.28  47.14  61.71  50.32  29.50  46.47  42.90  48.44  31.63
   16  44.41  53.19  55.01  77.00  56.68  40.58  42.74  57.85  54.18  26.49
   17  45.71  60.20  52.48  77.00  60.20  47.04  43.62  59.32  53.70  40.10
   18  39.33  57.33  36.36  62.33  60.21  45.25  39.01  53.84  40.83  39.56
   19  31.95  38.19  26.58  44.33  56.70  45.28  35.28  50.76  33.14  33.91
   20  41.64  31.23  35.12  39.01  36.30  39.91  39.02  45.61  30.81  31.92
   21  42.94  34.92  33.81  61.62  54.26  28.64  40.49  40.39  40.99  29.55
   22  35.97  50.69  25.17  55.06  57.69  40.28  35.91  40.42  30.98  29.01
   23  36.49  36.54  43.62  40.50  53.46  44.64  34.21  38.68  31.26  35.40
   24  29.85  42.49  41.01  66.22  43.90  33.07  37.91  38.39  34.57  28.46
   25  21.63  41.52  44.35  56.88  56.70  42.23  35.53  51.78  39.75  32.64
   26  25.47  39.44  42.52  72.66  36.06  37.05  35.46  46.21  40.89  27.65
   27  28.29  34.56  42.29  55.43  47.19  28.21  26.29  41.31  41.31  28.29
   28  35.41  35.26  44.29  61.79  29.15  25.83  22.25  37.05  30.40  29.33
   29  35.48  31.55  42.31  48.80  33.79  26.74  26.89  34.60  26.54  28.44
   30  40.96  36.31  42.26  51.83  39.28  33.04  32.01  40.26  32.84  28.97
   31  37.45  38.12  35.98  58.59  39.57  36.44  29.66  41.73  37.15  32.82
   32  34.12  36.10  38.07  67.61  36.08  33.59  28.00  41.84  34.94  33.77
   33  38.42  38.48  36.35  58.76  32.56  28.31  23.14  46.35  39.50  34.09
   34  38.72  44.06  38.87  66.64  31.51  30.60  29.52  48.73  41.78  39.37
   35  33.75  40.09  40.12  93.71  38.02  33.40  38.41  50.08  38.72  39.74
   36  37.27  38.05  43.76  94.95  45.51  39.25  38.74  47.57  40.93  38.95
   37  35.68  36.25  47.49  99.00  50.83  38.97  43.10  52.16  40.61  41.60
---------------------------------------------------------------------------

Table K-1 -continued.
---------------------------------------------------------------------------
                               Cry Sample #
Bin #   11.1   12.1   12.2   13.1   13.2   14.1   14.2   15.1   15.2   16.1
---------------------------------------------------------------------------
    5  42.14  47.14  48.16  43.94  47.14  48.16  45.39  49.32  50.66  43.94
    6  42.16  48.16  50.66  44.64  48.16  49.32  47.14  50.66  52.26  45.39
    7  42.14  48.16  49.32  43.40  48.16  49.32  48.16  50.66  50.66  45.39
    8  43.94  48.16  50.66  43.40  49.32  50.66  49.32  52.24  54.18  45.39
    9  43.67  47.19  52.24  43.97  49.32  52.24  49.99  54.18  54.18  45.81
   10  43.32  45.81  52.24  45.39  50.66  53.21  49.32  56.68  56.68  45.39
   11  43.94  45.81  52.24  43.39  49.99  54.18  51.45  56.68  56.68  44.64
   12  46.63  47.65  76.59  44.89  52.42  77.84  75.62  79.60  79.60  48.03
   13  44.91  42.97  70.79  38.74  47.82  73.13  69.96  73.13  75.14  49.38
   14  43.04  37.90  44.38  40.21  37.19  60.20  56.68  60.20  60.20  50.66
   15  41.11  45.54  34.35  46.33  34.18  64.72  46.98  54.32  50.26  46.01
   16  47.82  41.20  40.27  52.89  46.34  66.22  30.83  43.31  44.83  46.11
   17  50.43  44.44  53.73  49.37  49.37  66.22  47.23  66.22  61.22  46.73
   18  47.16     40  49.55  43.02  46.33  66.22  52.33  74.42  66.22  46.21
   19  40.78  35.42  43.37  28.34  37.09  63.81  50.33  66.22  66.22  39.98
   20  37.16  38.93  35.33  34.58  27.57  53.62  40.12  61.62  58.44  26.16
   21  37.18  41.69  34.92  43.61  36.20  53.39  31.90  49.50  45.93  30.85
   22  38.71  46.87  54.05  37.17  36.10  47.33  23.08  43.89  37.17  36.16
   23  40.81  46.09  54.08  36.33  35.04  36.91  42.20  51.50  61.71  44.67
   24  38.60  60.76  54.31  43.22  37.96  61.70  51.54  49.67  46.59  36.69
   25  37.71  82.61  80.81  45.65  48.13  54.60  46.00  34.68  42.27  30.06
   26  34.44  99.00  66.47  52.29  33.78  54.38  45.87  45.86  48.66  22.27
   27  32.09  71.91  57.44  49.72  42.31  55.00  44.10  44.90  43.25  23.88
   28  30.98  71.53  73.25  47.16  40.24  55.12  45.52  46.11  40.64  28.14
   29  34.20  65.50  56.98  46.14  45.92  52.76  51.53  46.22  33.77  30.23
   30  37.82  63.85  65.43  48.80  50.28  66.48  49.40  46.67  41.70  33.16
   31  36.14  71.61  57.92  50.87  46.37  65.44  47.60  44.93  41.76  38.14
   32  34.86  99.00  78.14  46.05  39.14  46.40  61.36  54.53  47.77  38.41
   33  38.88  99.00  83.03  44.97  43.63  64.70  66.28  62.17  61.90  35.00
   34  43.60  97.83  94.32  44.18  45.59  58.77  55.38  60.21  59.63  40.52
   35  49.13  86.40  99.00  46.26  49.59  50.40  46.76  58.75  61.94  48.37
   36  49.94  99.00  99.00  51.56  57.10  51.11  48.61  58.97  59.33  48.97
   37  51.72  99.00  99.00  55.42  54.24  53.20  57.56  72.48  60.03  49.85
---------------------------------------------------------------------------

Table K-1 -continued.
---------------------------------------------------------------------------
                               Cry Sample #
Bin #   16.2   17.1   17.2   18.1   18.2   19.1   19.2   20.1   20.2   21.1
---------------------------------------------------------------------------
    5  44.64  37.92  43.30  45.39  46.22  42.14  48.16  42.14  42.70  44.64
    6  45.39  38.99  43.30  46.22  48.16  43.94  49.32  42.14  42.70  41.61
    7  44.64  38.26  42.14  46.22  47.14  43.94  48.16  41.12  39.78  40.65
    8  47.14  38.62  43.30  46.22  44.64  45.39  49.32  43.30  39.78  43.94
    9  47.65  37.76  43.30  43.67  45.22  47.14  50.66  43.97  41.88  46.68
   10  46.22  39.00  42.14  43.00  45.08  47.14  48.74  39.99  43.00  47.65
   11  43.32  37.62  36.08  39.06  43.30  46.68  42.30  42.78  41.61  48.74
   12  47.19  37.05  27.76  41.37  39.80  49.41  40.91  48.44  43.77  49.23
   13  49.77  32.43  32.75  45.72  35.39  45.17  38.76  52.36  45.72  44.17
   14  49.99  33.67  35.47  41.24  38.78  55.43  25.85  53.21  42.64  23.51
   15  49.22  40.79  40.23  42.09  46.04  64.72  49.26  39.93  31.22  32.40
   16  40.09  42.19  41.30  47.63  46.00  64.22  56.37  38.76  26.60  40.06
   17  42.39  40.92  39.14  42.89  39.82  63.21  50.98  52.48  35.41  48.15
   18  43.46  34.85  33.28  48.82  46.06  43.64  47.09  48.50  40.93  47.94
   19  38.56  30.60  36.13  45.50  40.13  43.07  39.87  47.85  38.64  42.27
   20  56.09  26.22  32.74  44.26  46.94  51.76  29.01  48.70  33.65  24.91
   21  38.65  28.30  30.56  42.17  42.31  46.36  54.63  36.36  29.83  37.59
   22  33.15  25.09  32.45  47.16  44.83  39.44  59.55  38.33  29.03  30.57
   23  43.55  30.55  30.01  47.11  45.98  53.80  49.34  40.59  32.05  19.48
   24  35.08  35.37  34.95  46.05  45.43  48.28  75.81  32.06  34.03  37.47
   25  31.12  35.84  40.26  44.75  42.52  59.30  74.22  26.22  26.39  47.77
   26  22.64  33.20  35.23  38.02  31.70  57.02  57.82  34.25  23.72  37.45
   27  34.68  33.09  28.75  31.58  26.55  55.26  53.40  35.76  26.56  40.32
   28  33.93  26.53  27.88  29.65  31.50  52.19  50.41  42.80  34.61  37.09
   29  34.09  31.03  34.24  39.08  37.02  47.08  47.16  42.63  35.61  40.22
   30  39.43  29.93  31.13  44.78  40.84  50.61  53.71  46.89  33.86  39.22
   31  39.29  29.34  29.06  43.85  37.23  49.20  54.38  38.31  30.54  35.49
   32  39.55  39.12  34.77  43.07  43.49  52.20  59.42  36.73  27.32  36.75
   33  42.56  38.90  38.70  48.57  58.64  62.87  84.34  44.53  32.34  40.53
   34  48.49  33.53  43.83  54.24  66.01  75.59  97.83  46.10  36.17  40.84
   35  53.91  38.57  41.23  56.01  59.76  66.17  87.37  46.36  35.68  45.63
   36  54.55  38.58  42.04  58.55  54.44  59.24  83.02  42.16  33.80  44.91
   37  55.28  40.98  41.75  59.45  55.77  60.63  67.59  46.08  39.78  49.40
---------------------------------------------------------------------------

Table K-1 -continued.
---------------------------------------------------------------------------
                               Cry Sample #
Bin #   21.2   22.1   22.2   23.1   23.2   24.1   24.2   25.1   25.2   26.1
---------------------------------------------------------------------------
    5  42.14  46.22  49.32  44.64  56.68  49.32  50.66  41.12  30.81  40.20
    6  44.64  47.14  52.24  45.39  56.68  50.66  52.24  42.70  32.97  42.70
    7  44.64  47.14  50.66  43.94  56.68  50.66  52.24  43.30  39.37  42.14
    8  46.22  49.32  52.24  44.64  56.68  52.24  52.24  44.64  42.14  43.30
    9  48.16  49.99  54.18  46.27  58.44  52.24  52.24  45.81  37.32  43.94
   10  48.74  50.66  56.68  47.14  56.68  52.24  53.21  46.22  34.41  45.39
   11  50.66  48.90  56.68  45.89  56.68  52.24  54.18  45.81  40.96  39.94
   12  53.21  38.23  46.40  42.70  55.43  60.20  60.20  44.35  40.89  41.13
   13  46.78  30.23  35.02  30.03  36.11  69.12  56.19  32.11  23.14  43.35
   14  35.17  50.66  35.80  40.12  57.19  46.40  54.18  44.61  35.92  33.75
   15  24.60  57.82  64.72  49.44  63.21  47.92  43.09  58.44  38.90  33.83
   16  30.92  53.32  64.22  49.83  64.22  57.85  30.05  57.85  40.30  44.24
   17  50.87  45.41  57.56  51.85  66.22  60.20  42.96  56.06  37.90  40.13
   18  50.72  32.95  40.44  42.36  57.56  56.68  49.99  43.06  35.09  42.76
   19  45.89  26.37  31.19  31.14  39.67  49.39  52.63  30.09  24.19  35.48
   20  31.08  50.65  46.61  51.13  62.21  36.06  51.08  50.00  37.46  35.94
   21  24.84  51.46  61.62  47.13  62.63  39.57  29.13  47.65  38.95  39.63
   22  26.09  41.39  56.56  33.46  42.66  50.29  24.88  35.58  23.74  44.78
   23  31.07  59.45  68.79  44.76  58.77  39.97  47.87  53.10  36.56  44.99
   24  29.44  50.78  60.64  37.64  51.88  33.16  47.89  51.11  39.43  43.28
   25  32.97  56.11  47.16  34.74  52.30  33.27  25.44  53.80  36.19  43.19
   26  30.38  47.64  55.31  28.47  40.13  23.83  31.94  57.15  41.36  38.11
   27  35.21  57.79  59.47  26.08  42.25  34.62  46.05  50.27  25.71  33.21
   28  40.80  52.09  76.14  29.35  38.56  35.35  37.71  40.61  23.45  33.40
   29  43.48  56.09  64.91  38.62  50.15  38.02  46.65  47.96  32.43  38.86
   30  39.74  62.83  62.59  39.90  55.98  44.57  45.36  54.25  35.85  38.47
   31  33.56  57.05  62.03  36.02  52.59  45.15  42.44  54.15  36.47  39.90
   32  34.65  51.30  67.42  34.02  46.01  54.65  49.79  53.48  33.61  44.13
   33  40.38  51.31  59.86  37.70  55.97  61.83  51.70  49.59  31.79  52.99
   34  43.10  56.77  57.11  38.64  58.98  60.17  49.28  51.31  36.47  58.93
   35  45.44  53.42  50.13  40.83  55.17  61.23  47.98  57.58  42.72  62.23
   36  46.67  46.33  59.88  41.11  50.70  49.97  46.48  56.31  41.87  56.14
   37  50.03  53.27  89.29  40.59  46.12  43.59  45.58  57.85  44.03  53.69
---------------------------------------------------------------------------

Table K-1 -continued.
---------------------------------------------------------------------------
                               Cry Sample #
Bin #   26.2   27.1   27.2   28.1   28.2   29.1   29.2   30.1   30.2   31.1
---------------------------------------------------------------------------
    5  48.16  48.16  52.24  44.63  44.64  45.39  48.16  41.61  47.14  49.32
    6  50.66  50.66  54.18  45.39  45.39  46.22  48.16  42.14  41.61  50.66
    7  50.66  49.32  54.18  45.39  46.22  46.22  47.14  41.61  39.78  50.66
    8  50.66  52.24  56.68  47.14  45.39  47.14  49.32  42.14  42.70  50.66
    9  52.24  42.42  60.20  46.27  43.94  44.46  50.66  42.14  44.29  52.24
   10  51.45  39.77  60.20  47.65  46.22  39.38  48.23  42.42  37.48  51.45
   11  51.45  54.18  63.21  44.05  42.78  40.42  47.65  34.27  36.93  46.68
   12  60.20  55.08  66.00  44.18  42.64  41.66  53.67  33.04  34.41  40.89
   13  49.99  41.28  66.00  37.70  35.73  40.91  57.55  31.86  34.65  28.67
   14  21.02  36.26  37.76  49.79  34.40  46.68  55.43  25.32  29.72  44.76
   15  46.82  34.87  35.87  53.70  40.65  45.41  41.05  34.66  39.11  58.44
   16  59.03  62.21  50.35  53.54  48.93  45.03  34.71  43.83  43.18  53.84
   17  53.21  66.22  57.82  49.99  46.68  47.71  42.38  40.58  40.04  49.15
   18  50.49  42.58  63.21  44.64  42.79  52.24  39.27  37.96  34.62  33.32
   19  35.84  39.37  52.50  41.36  39.46  50.48  39.15  29.19  34.87  26.18
   20  23.55  35.26  37.90  48.75  34.83  44.77  33.75  21.99  30.21  43.76
   21  44.94  40.03  47.67  45.58  43.00  42.35  25.79  25.92  30.26  38.86
   22  57.33  45.29  57.18  39.22  35.82  34.37  25.82  30.44  31.79  29.39
   23  51.59  42.12  60.27  46.28  35.79  40.31  34.74  24.39  31.76  48.18
   24  80.12  44.80  63.34  40.99  32.37  39.77  45.21  30.21  27.56  42.24
   25  76.09  56.44  66.00  39.49  36.26  47.47  51.75  32.80  37.49  37.02
   26  53.95  44.17  65.28  34.54  27.13  44.86  41.17  31.35  39.52  39.80
   27  61.39  43.61  74.05  28.23  29.98  38.97  34.27  23.27  31.05  38.49
   28  72.86  34.45  61.42  38.76  31.96  29.96  48.20  25.76  28.25  41.03
   29  64.21  37.95  84.66  44.55  38.13  40.10  47.54  30.49  36.02  45.07
   30  64.59  45.60  99.00  43.31  38.86  40.71  39.38  32.75  37.63  43.11
   31  76.11  49.55  99.00  37.51  33.22  46.46  41.05  29.91  39.51  37.99
   32  84.57  44.42  99.00  36.03  32.60  49.31  46.86  26.97  40.22  41.03
   33  88.51  52.52  99.00  40.77  37.56  38.02  46.08  34.02  38.78  43.76
   34  88.46  59.97  99.00  43.13  38.01  38.27  50.61  39.20  44.23  43.75
   35  81.28  64.23  99.00  42.30  36.25  48.65  48.43  38.49  44.33  45.09
   36  52.28  63.59  99.00  36.60  32.85  47.94  44.71  34.50  44.98  42.38
   37  55.95  65.74  99.00  39.65  37.91  47.30  46.13  37.54  50.28  46.58
---------------------------------------------------------------------------

Table K-1 -continued.
--------------------------------------------------------------
       Cry Sample #                      LEGEND
Bin #       31.2       X     SD      Bin #  Frequency Range
--------------------------------------------------------------
    5      48.16    45.8    4.1          5  140-160
    6      49.00    46.6    4.3          6  160-180
    7      49.32    46.0    4.4          7  180-200
    8      52.24    47.1    4.6          8  200-220
    9      52.24    46.9    6.2          9  220-260
   10      52.24    47.2    6.0         10  260-300
   11      51.45    46.9    6.5         11  300-340
   12      39.30    50.7   13.2         12  340-380
   13      35.36    44.6   13.1         13  380-440
   14      56.68    41.8    9.8         14  440-480
   15      60.20    45.2   10.2         15  480-560
   16      60.20    47.6   10.6         16  560-620
   17      56.60    50.2    8.9         17  620-700
   18      33.97    46.6    9.1         18  700-780
   19      40.74    40.5    9.4         19  780-880
   20      54.26    39.6    9.8         20  880-1000
   21      34.89    40.1    9.2         21  1000-1120
   22      32.17    38.6    9.4         22  1120-1260
   23      53.27    42.5    9.5         23  1260-1420
   24      36.04    43.4   11.2         24  1420-1600
   25      40.56    44.1   13.2         25  1600-1800
   26      37.67    41.2   13.9         26  1800-2020
   27      46.85    39.6   12.6         27  2020-2260
   28      46.60    40.1   13.8         28  2260-2540
   29      44.83    41.7   11.4         29  2540-2860
   30      45.38    44.2   12.0         30  2860-3200
   31      42.00    43.6   13.0         31  3200-3600
   32      47.20    45.0   16.6         32  3600-4040
   33      49.34    48.4   17.6         33  4040-4540
   34      48.52    50.8   17.9         34  4540-5100
   35      45.24    51.5   15.8         35  5100-5720
   36      41.60    51.3   15.5         36  5720-6440
   37      45.80    53.8   16.0         37  6440-7220
--------------------------------------------------------------
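
The legend of Table K-1 pairs each bin number with a frequency range. As an illustration only, the sketch below shows one way to look up the bin for a given frequency. The function names are hypothetical, the edges are assumed to be in Hz, and each bin is assumed to include its lower edge and exclude its upper edge, since adjacent ranges in the legend share an endpoint and the source does not say which bin owns it.

```python
from bisect import bisect_right

# Bin edges transcribed from the LEGEND of Table K-1 (bins 5 through 37).
# Assumptions: edges are in Hz and each bin is treated as [low, high).
BIN_EDGES = [140, 160, 180, 200, 220, 260, 300, 340, 380, 440, 480, 560,
             620, 700, 780, 880, 1000, 1120, 1260, 1420, 1600, 1800, 2020,
             2260, 2540, 2860, 3200, 3600, 4040, 4540, 5100, 5720, 6440, 7220]
FIRST_BIN = 5  # the legend starts numbering at bin 5


def bin_for_frequency(freq):
    """Return the legend bin number containing freq, or None if out of range."""
    if freq < BIN_EDGES[0] or freq >= BIN_EDGES[-1]:
        return None
    # bisect_right counts the edges <= freq; the last such edge opens the bin.
    return FIRST_BIN + bisect_right(BIN_EDGES, freq) - 1


def bin_range(bin_number):
    """Return the (low, high) frequency range listed for a bin number."""
    i = bin_number - FIRST_BIN
    if not 0 <= i < len(BIN_EDGES) - 1:
        raise ValueError("bin number outside 5-37")
    return BIN_EDGES[i], BIN_EDGES[i + 1]


print(bin_for_frequency(450))  # falls in the 440-480 range
print(bin_range(21))
```

The bin widths grow with frequency (20 units wide at bin 5, 780 wide at bin 37), so a sorted edge list with a binary search is simpler here than a fixed-width formula.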

APPENDIX L
PERCEPTUAL JUDGEMENTS OF CRY

Table L-1. The author's perceptual judgements of the cries relative to four parameters of vocal quality: roughness (i.e., noise level), vocal fry, frequency variation, or octave shifts (i.e., voice breaks).

PERCEPTUAL JUDGEMENTS

S    Rough*     Fry@     Freq. Var.   Oct. Shift
     (0,1,2)    (Y,N)    (Y,N)        (Y,N)

S: 1.1 2.1 3.1 4.1 5.1 6.1 7.1 8.1 9.1 10.1 11.1 12.1 13.1 14.1 15.1 16.1 17.1 18.1 19.1 20.1 21.1 22.1 23.1 24.1 25.1 26.1 27.1 28.1 29.1 30.1 31.1
Rough*: 0 2 2 1 1 0 0 1 1 0 1 0 1 0 1 1 1 2 1 1 1 0 1 0 0 1 1 0 1 0 0
Fry@, Freq. Var., Oct. Shift: Y Y N N Y Y N N Y Y Y Y N N N N Y Y Y Y N N N N N Y N Y Y Y Y Y N N N N Y N N Y Y Y Y Y N N Y N N N N N Y N Y N N N N N N N N N N Y N N N N N Y Y N N N N N N N N N N Y N N N N N N N N N

S: 1.2 2.2 3.2 4.2 5.2 6.2 7.2 8.2 9.2 10.2 12.2 13.2 14.2 15.2 16.2 17.2 18.2 19.2 20.2 21.2 22.2 23.2 24.2 25.2 26.2 27.2 28.2 29.2 30.2 31.2
Rough*: 0 0 0 2 1 0 0 1 0 1 0 0 1 0 1 1 1 0 1 1 0 1 0 1 1 0 0 1 2 0
Fry@, Freq. Var., Oct. Shift: Y Y N N N N N N Y N Y N N N N N Y N N N N N N N Y N N Y Y N Y N N N N N N N N N N N Y N N N N N N N N N N N N N N Y N N N N N N N N N N N N N N Y N N N N N N N N N N N N N N Y N N

*Classification of ROUGH values is as follows:
  0 - clear phonation or very little noise during phonation
  1 - moderate or intermittent noise during phonation
  2 - extremely noisy phonation
@Fry is exhibited for limited portions of each designated cry.

Table L-2. Individual and mean perceptual responses to "strain" on each of the cry samples. Listeners used a scale from 1 to 5, with 1 meaning least and 5 meaning most strain.
------------------------------------------------------------------------
 Cry     Listener                    Cry     Listener
   #  1  2  3  4  5     X   SD       #  1  2  3  4  5     X   SD X.1-X.2
------------------------------------------------------------------------
 1.1  4  5  4  4  4   4.2  0.4     1.2  3  3  4  3  3   3.2  0.4    1.0
 2.1  5  5  5  5  5   5.0  0.0     2.2  2  2  2  1  2   1.8  0.4    3.2
 3.1  3  4  2  2  3   2.8  0.8     3.2  2  1  1  2  2   1.6  0.5    1.2
 4.1  4  5  4  4  5   4.4  0.5     4.2  5  2  3  5  4   3.8  1.3    0.6
 5.1  5  5  5  5  5   5.0  0.0     5.2  3  4  4  5  2   3.6  1.1    1.4
 6.1  2  4  4  2  4   3.2  1.1     6.2  1  1  1  1  1   1.0  0.0    2.2
 7.1  1  2  2  2  3   2.0  0.7     7.2  1  1  1  1  1   1.0  0.0    1.0
 8.1  4  5  5  5  4   4.6  0.5     8.2  2  1  5  2  4   2.8  1.6    1.8
 9.1  5  5  5  4  5   4.8  0.4     9.2  3  2  4  2  3   2.8  0.8    2.0
10.1  1  1  3  2  2   1.8  0.8    10.2  4  3  4  3  3   3.4  0.5   -1.6
11.1  4  4  5  4  5   4.4  0.5      --
12.1  3  3  2  2  5   3.0  1.2    12.2  1  1  1  2  2   1.4  0.5    1.6
13.1  2  2  3  3  4   2.8  0.8    13.2  1  1  1  2  3   1.6  0.9    1.2
14.1  2  1  3  3  2   2.2  0.8    14.2  2  1  2  3  3   2.2  0.8    0.0
15.1  5  3  5  5  5   4.6  0.9    15.2  5  3  5  5  4   4.4  0.9    0.2
16.1  3  1  2  4  3   2.6  1.1    16.2  2  1  2  2  3   2.0  0.7    0.6
17.1  5  3  5  4  5   4.4  0.9    17.2  3  1  2  2  2   2.0  0.7    2.4
18.1  5  4  5  4  5   4.6  0.5    18.2  4  2  2  3  4   3.0  1.0    1.6
19.1  5  5  5  4  5   4.8  0.4    19.2  2  1  1  1  2   1.4  0.5    3.4
20.1  5  4  5  4  4   4.4  0.5    20.2  4  1  4  4  3   3.2  1.3    1.2
21.1  3  3  3  2  3   2.8  0.4    21.2  2  1  2  2  2   1.8  0.4    1.0
22.1  1  2  3  3  2   2.2  0.8    22.2  1  1  1  1  2   1.2  0.4    1.0
23.1  4  4  5  4  4   4.2  0.4    23.2  2  1  2  4  3   2.4  1.1    1.8
24.1  3  2  3  4  3   3.0  0.7    24.2  1  1  1  1  2   1.2  0.4    1.8
25.1  5  5  5  5  5   5.0  0.0    25.2  4  2  4  2  3   3.0  1.0    2.0
26.1  5  3  4  5  5   4.4  0.9    26.2  2  3  2  3  4   2.8  0.8    1.6
27.1  4  3  3  3  3   3.2  0.4    27.2  1  1  1  2  1   1.2  0.4    2.0
28.1  5  4  5  4  5   4.6  0.5    28.2  4  1  2  2  4   2.6  1.3    2.0
29.1  4  3  1  2  3   2.6  1.1    29.2  2  2  2  5  3   2.8  1.3   -0.2
30.1  3  1  3  3  2   2.4  0.9    30.2  5  3  2  1  3   2.8  2.5   -0.4
31.1  1  1  1  2  3   1.6  0.9    31.2  1  1  4  4  3   2.6  1.5   -1.0
------------------------------------------------------------------------
                                                             X      1.2
                                                             SD     1.1
------------------------------------------------------------------------
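
The X, SD, and X.1-X.2 columns of Table L-2 can be recomputed from the five listener ratings in each row. The short sketch below does this for cries 4.1 and 4.2. It assumes the SD column is the sample standard deviation (n - 1 denominator); the source does not state this, but that choice reproduces the tabled values for these rows. The function name `summarize` is illustrative only.

```python
from statistics import mean, stdev


def summarize(ratings):
    """Return (mean, sample SD) of one cry's listener ratings, rounded to 0.1."""
    # stdev() uses the n - 1 denominator; this is an assumption about the table.
    return round(mean(ratings), 1), round(stdev(ratings), 1)


# Listener ratings for cries 4.1 and 4.2, copied from Table L-2.
first_cry = [4, 5, 4, 4, 5]
second_cry = [5, 2, 3, 5, 4]

first_mean, first_sd = summarize(first_cry)
second_mean, second_sd = summarize(second_cry)
difference = round(first_mean - second_mean, 1)

print(first_mean, first_sd)    # tabled: 4.4 0.5
print(second_mean, second_sd)  # tabled: 3.8 1.3
print(difference)              # tabled: 0.6
```

The same function applies unchanged to the roughness ratings in Table L-3, which uses the same layout.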

Table L-3. Individual and mean perceptual responses to "roughness" on each of the cry samples. Listeners used a scale from 1 to 5, with 1 meaning least and 5 meaning most roughness.
------------------------------------------------------------------------
 Cry     Listener                    Cry     Listener
   #  1  2  3  4  5     X   SD       #  1  2  3  4  5     X   SD X.1-X.2
------------------------------------------------------------------------
 1.1  1  4  3  1  4   2.6  1.5     1.2  3  3  3  1  3   2.6  0.9    0.0
 2.1  5  5  5  4  5   4.8  0.4     2.2  3  1  4  2  2   2.4  1.1    2.4
 3.1  5  3  5  3  4   4.0  1.0     3.2  2  1  2  3  2   2.0  0.7    2.0
 4.1  3  4  5  2  4   3.6  1.1     4.2  5  5  5  3  4   4.4  0.9   -0.8
 5.1  4  5  5  3  5   4.4  0.9     5.2  3  1  3  2  3   2.4  0.9    2.0
 6.1  2  3  2  4  4   3.0  1.0     6.2  1  1  1  1  1   1.0  0.0    2.0
 7.1  2  3  1  1  3   2.0  1.0     7.2  1  2  1  1  1   1.2  0.4    0.8
 8.1  3  3  4  4  4   3.6  0.5     8.2  4  4  4  3  4   3.8  0.4   -0.2
 9.1  3  5  5  4  5   4.4  0.9     9.2  2  3  4  5  3   3.4  1.1    1.0
10.1  2  2  2  1  2   1.8  0.4    10.2  3  4  5  5  3   4.0  1.0   -2.2
11.1  3  4  5  3  4   3.8  0.8      --
12.1  4  5  5  5  4   4.6  0.5    12.2  3  2  2  2  3   2.4  0.5    2.2
13.1  2  4  3  3  3   3.0  0.7    13.2  2  2  2  2  2   2.0  0.0    1.0
14.1  1  1  1  2  2   1.4  0.5    14.2  4  4  3  3  3   3.4  0.5   -2.0
15.1  3  4  4  3  4   3.6  0.5    15.2  2  3  2  3  4   2.8  0.8    0.8
16.1  4  3  4  4  4   3.8  0.4    16.2  4  3  4  4  4   3.8  0.4    0.0
17.1  5  3  4  4  4   4.0  0.7    17.2  4  3  4  4  3   3.6  0.5    0.4
18.1  5  4  5  5  5   4.8  0.4    18.2  4  4  4  2  4   3.6  0.9    1.2
19.1  4  3  4  3  5   3.8  0.8    19.2  1  1  2  1  2   1.4  0.5    2.4
20.1  4  3  5  1  4   3.4  1.5    20.2  4  4  5  2  3   3.6  1.1   -0.2
21.1  3  4  3  2  2   2.8  0.8    21.2  2  4  2  2  3   2.6  0.9    0.2
22.1  2  2  2  1  3   2.0  0.7    22.2  1  3  1  3  3   2.2  1.1   -0.2
23.1  4  4  5  4  4   4.2  0.4    23.2  2  3  2  4  3   2.8  0.8    1.4
24.1  2  3  2  2  4   2.6  0.9    24.2  1  2  1  2  3   1.8  0.8    0.8
25.1  3  4  4  1  4   3.2  1.3    25.2  3  4  5  4  4   4.0  0.7   -0.8
26.1  5  4  5  4  5   4.6  0.5    26.2  3  4  3  2  3   3.0  0.7    1.6
27.1  2  3  2  3  4   2.8  0.8    27.2  2  2  1  1  1   1.4  0.5    1.4
28.1  3  5  4  2  5   3.8  1.3    28.2  4  4  3  4  3   3.6  0.5    0.2
29.1  3  4  2  4  3   3.2  0.8    29.2  4  5  5  5  4   4.6  0.5   -1.4
30.1  3  3  3  3  2   2.8  0.4    30.2  4  4  4  5  4   4.2  0.4   -1.4
31.1  1  4  1  2  3   2.2  1.3    31.2  1  4  1  3  3   2.4  1.3   -0.2
------------------------------------------------------------------------
                                                             X      0.5
                                                             SD     1.3
------------------------------------------------------------------------

REFERENCES

Aldrich, C.A., Sung, C., and Knop, C. (1945) The crying of newly born babies, J. Ped., 26:313-326.

Als, H., Tronick, E., Lester, B.M., and Brazelton, T.B. (1977) The Brazelton neonatal behavioral assessment scale (BNBAS), J. Abnormal Child Psych., 5:215-231.

Anderson, G.C. (1978) Relation of sucking from birth to oxygenation and clinical course in premature infants: A theoretical framework, Unpublished manuscript.

Anderson, G.C. (1984) Crying in transitional newborn infants: physiology and developmental implications, Unpublished manuscript adapted from papers presented at the School of Nursing, Case Western Reserve University, October 2, 1983 and School of Nursing, University of California, San Francisco, February 6, 1984.

Aronson, A.E. (1980) Clinical Voice Disorders, Thieme-Stratton, Inc., New York.

Bell, S.M., and Ainsworth, M.D.S. (1972) Infant crying and maternal responsiveness, Child Dev., 43:1171-1190.

Blenick, G., Tavolga, W., and Antopol, W. (1971) Variations in the birth cries of newborn infants from narcotic addicted and normal mothers, J. Obstet. Gynaec., 110:948-958.

Bosma, J.F. (1975) Anatomic and physiologic development of the speech apparatus, In D.B. Tower (Ed.) The Nervous System, Vol. 3: Human Communication and Its Disorders, Raven Press, New York.

Bosma, J.F., Truby, H.M., and Lind, J. (1966) Cry motions of the newborn infant, Acta Paediat. Scand., 49, Suppl. 163:61-92.

Brazelton, T.B. (1976) Neonatal Behavioral Assessment Scale, Clinics in Developmental Medicine, No. 50, SIMP with Heinemann Medical, London.

Brazelton, T.B. (1981) Behavioral competence of the newborn infant, Chapter 18 in G.B. Avery (Ed.) Neonatology: Pathophysiology and Management of the Newborn, J.B. Lippincott Company, Philadelphia.

Brodnitz, F.S. (1959) Vocal Rehabilitation, Whiting Press, Rochester, Minn.

Casaer, P., and Akiyama, Y. (1970) The estimation of the postmenstrual age: A comprehensive review, Dev. Med. Child Neurol., 12:697.

Caldwell, H.S., and Leeper, H.A., Jr. (1974) Temporal patterns of neonatal vocalizations: a normative investigation, Percept. Motor Skills, 38:911-916.

Chiba, T., and Kajiyama, M. (1958) The Vowel, Its Nature and Structure, Phonetic Society of Japan, Tokyo.

Coleman, R.F. (1969) Effect of median frequency levels upon the roughness of jittered stimuli, J. Speech Hear. Res., 12:330-336.

Colton, R., and Steinschneider, A. (1980) Acoustic characteristics of first week infant cries: some relationships to the Sudden Infant Death Syndrome, Chapter 8 in T. Murry and J. Murry (Eds.) Infant Communication: Cry and Early Speech, College Hill, Houston.

Colton, R., and Steinschneider, A. (1981) The cry characteristics of an infant who died of SIDS, J. Sp. Hear. Dis., 46:359-363.

Colton, R., Steinschneider, A., Black, L., and Gleason, J. (1985) The newborn infant cry: its potential implications for development and SIDS, In B. Lester and C. Boukydis (Eds.) Infant Crying: Theoretical and Research Perspectives, Plenum Publishing Corporation, Boston.

Crystal, D. (1973) Nonsegmental phonology in language acquisition: a review of the issues, Lingua, 32:1-45.

Darwin, C. (1965) The expression of emotions in man and animals, University of Chicago Press, Chicago, Originally published 1872.

Doherty, E.T. (1975) An Evaluation of Selected Acoustic Parameters for Use in Speaker Identification, Ph.D. Dissertation, University of Florida.

Dubowitz, L., and Dubowitz, V. (1981) The Neurological Assessment of the Preterm and Full Term New Born Infant, Clinics in Developmental Medicine, No. 79, SIMP with Heinemann, London.

Fairbanks, G. (1942) An acoustical study of the pitch of infant hunger wails, Child Dev., 13:227-232.

Fant, G. (1970) Acoustic Theory of Speech Production, Mouton, The Hague.

Fisichelli, V.R., and Karelitz, S. (1963) The cry latencies of normal infants and those with brain damage, J. Ped., 62:724.

Fisichelli, V.R., Coxe, M., Rosenfeld, R., Haber, A., Davis, J., and Karelitz, S. (1966) The phonetic content of the cries of normal infants and those with brain damage, J. Psych., 64:119.

Flatau, T.S., and Gutzmann, H. (1906) Die stimme des sauglings, Arch. Laryng. Rhinol., 18:139-151.

Formby, D. (1967) Maternal recognition of infant's cry, Dev. Med. Child Neurol., 9:293-298.

Freudenberg, R.P., Driscoll, J.W., and Stern, G.S. (1978) Reactions of adult humans to cries of normal and abnormal infants, Infant Beh. Dev., 1:224-227.

Frodi, A.M., Lamb, M.E., Leavitt, L.A., Donovan, W.L., Neff, C., and Sherry, D. (1978) Fathers' and mothers' responses to the faces and cries of normal and premature infants, Dev. Psych., 14:490-498.

Frokjaer-Jensen, B., and Prytz, S. (1974) Evaluation of speech disorders by means of long-time-average-spectra, ARIPUC, 8:227-237.

Gardiner, W. (1838) The Music of Nature, Wilkins and Carter, Boston.

Gardosik, T.A., Ross, P.J., and Singh, S. (1980) Acoustic characteristics of the first cries of infants, Chapter 5 in T. Murry and J. Murry (Eds.) Infant Communication: Cry and Early Speech, College Hill, Boston.

Gay, T., Hirose, H., Strome, M., and Sawashima, M. (1972) Electromyography of the intrinsic laryngeal muscles during phonation, Ann. Otol., 81:401-409.

Golub, H. (1980) A physioacoustic model of the infant cry and its use for medical diagnosis and prognosis, Ph.D. Dissertation, Massachusetts Institute of Technology.

Hecker, M.H.L., and Kreul, E.J. (1971) Descriptions of the speech of patients with cancer of the vocal folds, part 1: measurements of fundamental frequency, J. Acoust. Soc. Amer., 49:1275-1282.

Hollien, H. (1980) Developmental aspects of neonatal vocalizations, Chapter 2 in T. Murry and J. Murry (Eds.) Infant Communication: Cry and Early Speech, College Hill, Houston.

Hollien, H. (1981) Analog instrumentation for acoustic speech analysis, Chapter 4 in J. Darby (Ed.) Speech Evaluation in Psychiatry, Grune and Stratton, New York, 79-103.

Hollien, H., and Hicks, J.W., Jr. (1979) Mechanisms for the control of vocal frequency, In J.J. Wolf and D.R. Klatt (Eds.) Speech Communication Papers, ASA-50, Acoustical Society of America, New York, 97-100.

Hollien, H., Michel, J., and Doherty, E.T. (1973) A method for analyzing vocal jitter in sustained phonation, J. Phonetics, 1:85-91.

Hopkin, G.B. (1967) Neonatal and adult tongue dimensions, Angle Orthodont., 27:132-133.

Horii, Y. (1979) Fundamental frequency perturbation observed in sustained phonation, J. Speech Hear. Res., 22:5-19.

Horii, Y. (1981) Perturbation characteristics of the voice, Paper presented at the Tenth Symposium of the Care of the Professional Voice, Voice Foundation, New York, June, 1981.

Illingworth, R.S. (1955) Crying in infants and children, Brit. Med. J., 1:75-78.

Karelitz, S., and Fisichelli, V.R. (1962) The cry thresholds of normal infants and those with brain damage, J. Ped., 61:679-685.

Karlberg, P. (1960) The adaptive changes in the immediate postnatal period with particular reference to respiration, J. Ped., 56:585-604.

Kittel, G., and Hecht (1977) Der erste schrei-frequenzanalytische untersuchungen, Sprache-Stimme-Gehor, 1:151-155.

Klaus, M.H., Tooley, W.R., Weaver, K.H., and Clements, J.A. (1963) Lung volume in the newborn infant, Pediatrics, 30:111-116.

Lenneberg, E.H. (1967) Biological Foundations of Language, Wiley and Sons, New York.

Lester, B.M. (1976) Spectrum analysis of the cry sounds of well nourished and malnourished infants, Child Dev., 47:237-241.

Lester, B.M. (1978) The organization of crying in the neonate, J. Ped. Psych., 3:3,122-30.

Lester, B.M., and Zeskind, P.S. (1978) Brazelton scale and physical size correlates of neonatal cry features, Infant Beh. Dev., 1:393-402.

Lester, B.M., and Zeskind, P.S. (1982) A biobehavioral perspective on crying in early infancy, In Fitzgerald, Lester and Yogman (Eds.) Theory and Research in Behavioral Pediatrics, Vol. 1, Plenum Publishing Corporation, Boston.

Lieberman, P. (1963) Some acoustic measures of the fundamental periodicity of normal and pathological larynges, J. Acoust. Soc. Amer., 35:344-353.

Lieberman, P., Harris, K.S., Wolff, P., and Russell, I.H. (1971) Newborn infant cry and nonhuman primate vocalization, J. Speech Hear. Res., 14:718-727.

Lind, J. (1965) Newborn infant cry, Acta Paed. Stockh., Suppl. 163.

Lind, J., Vuorenkoski, B., Rosberg, G., Partanen, T., and Wasz-Hockert, O. (1970) Spectrographic analysis of vocal response to pain stimuli in infants with Down's syndrome, Dev. Med. Child Neurol., 12:478-486.

Lind, J., Wasz-Hockert, O., Vuorenkoski, V., and Valanne, E. (1965) The vocalization of a newborn, brain damaged child, Annls. Paediat. Fenn., 11:32-37.

Long, E.C., and Hull, W.E. (1961) Respiratory volume flow in the crying newborn infant, Pediatrics, 373-377.

Lubchenco, L.O. (1976) The High Risk Infants, Saunders, Philadelphia.

Lubchenco, L.O., Hansman, C., and Boyd, E. (1966) Intrauterine growth in length and head circumference as estimated from live births at gestational ages from 26-42 weeks, Pediatrics, 37:403.

Massengill, R.M. (1969) Cry characteristics in cleft-palate neonates, J. Acoust. Soc. Amer., 45:782.

Michel, J. (1961) A pilot study of the pitch characteristics of hunger cries in neonates, In NIH Progress Report B-2162, University of Wichita, 1-2.

Michelsson, K. (1971) Cry analyses of symptomless low birth weight neonates and of asphyxiated newborn infants, Folia Phoniat., 28:161-173.

Michelsson, K. (1980) Cry characteristics in sound spectrographic analysis, Chapter 4 in T. Murry and J. Murry (Eds.) Infant Communication: Cry and Early Speech, College Hill Press, Houston.

Michelsson, K., and Sirvio, P. (1975) Cry analysis in herpes encephalitis, Proc., 5th Scand. Cong. Perinat. Med.

Michelsson, K., Sirvio, P., Koivisto, M., Sovijarvi, A., and Wasz-Hockert, O. (1975) Spectrographic analysis of pain cry in neonates with cleft palate, Biol. Neon., 26:353-358.

Michelsson, K., Sirvio, P., and Wasz-Hockert, O. (1977) Pain cry in full term asphyxiated newborn infants correlated with late findings, Acta Paed. Scand., 66:611-616.

Miller, H.C., and Hassanein, K. (1971) Diagnosis of impaired fetal growth in newborn infants, Pediatrics, 48:511-522.

Morsbach, G., and Bunting, C. (1979) Maternal recognition of their neonates' cries, Dev. Med. Child Neurol., 21:178-185.

Muller, E., Hollien, H., and Murry, T. (1974) Perceptual responses to infant crying: identification of cry types, J. Child Lang., 1:89-95.

Murry, T., and Doherty, E.T. (1977) Frequency perturbation and duration characteristics of pathological and normal speakers, J. Acoust. Soc. Amer., 62(S1):S5(A).

Murry, T., Amundson, P., and Hollien, H. (1975) Acoustical characteristics of infant cries: fundamental frequency, J. Child Lang., 4:321-328.

Nakazima, S. (1980) The reorganization process of babbling, Chapter 12 in T. Murry and J. Murry (Eds.) Infant Communication: Cry and Early Speech, College Hill Press, Houston.

Noback, G.J. (1923) The developmental topography of the larynx, trachea and lungs in the fetus, newborn infant and child, Am. J. Dis. Child., 26:515-533.

Osgood, L.E. (1953) Method and Theory in Experimental Psychology, Oxford University Press, New York.

Ostwald, P.F. (1963) The Acoustic Communication of Emotion, Charles C. Thomas, Springfield, Illinois.

Ostwald, P.F. (1972) The sounds of infancy, Dev. Med. Child Neurol., 14:350-361.

Ostwald, P.F., and Peltzman, P. (1974) The cry of the human infant, Scientific American, 230:84-90.

Ostwald, P.F., Peltzman, P., Greenberg, M., and Meyer, J. (1970) Cries of a trisomy 13-15 infant, Dev. Med. Child Neurol., 12:472.

Ounsted, M., and Ounsted, C. (1973) On fetal growth rate, Clinics in Developmental Medicine, No. 46, Spastics International Medical Publications, Lippincott, Philadelphia.

Parmalee, A.H. (1962) Infant crying and neurologic diagnosis, J. Ped., 61:801-802.

Ploog, D. (1981) Neurobiology of primate audio-vocal behavior, Brain Research Reviews, 3:35-61.

Prechtl, H.F.R. (1968) Neurological findings in newborn infants after pre- and perinatal complications, In J. Jonix, H. Visser and J. Troelstra (Eds.) Aspects of Prematurity and Dysmaturity, Stenfert Kroese, Leiden.

Prechtl, H.F.R. (1977) The neurological examination of the full-term newborn infant, Clinics in Developmental Medicine, No. 63, Spastics International Medical Publications, Philadelphia.

Prescott, R. (1975) Infant cry sound: developmental features, J. Acoust. Soc. Amer., 57:1186-1191.

Ringel, R.L., and Kluppel, D.D. (1964) Neonatal crying: a normative study, Folia Phoniat., 16:1-9.

Ruja, H. (1948) The relation between neonate crying and length of labor, The Journal of Genetic Psychology, 73:53-55.

Sagi, A. (1981) Mothers' and non-mothers' identification of infant cries, Infant Beh. Dev., 4:37-40.

Sheppard, W.C., and Lane, H.L. (1968) Development of the prosodic features of infant vocalizing, J. Speech Hear. Res., 11:94-108.

Sherman, M. (1927) The differentiation of emotional responses in infants, II: The ability of observers to judge the emotional characteristics of the crying of infants, and of the voice of an adult, J. Comp. Psych., 7:5, 265-284.

Shipp, T., and McGlone, R.E. (1971) Laryngeal dynamics associated with voice frequency change, J. Speech Hear. Res., 14:761-768.

Sirvio, P., and Michelsson, K. (1976) Sound spectrographic analysis of normal and abnormal newborn infants, Folia Phoniat., 28:161-173.

Stark, R.E., and Nathanson, S.N. (1975) Unusual features of cry in an infant dying suddenly and unexpectedly, In J.F. Bosma and J. Showacre (Eds.) Development of Upper Respiratory Anatomy and Function: Implication for SIDS (DHEW Publication No. (NIH) 75-941), U.S. Government Printing Office, Washington.

Stark, R.E., Rose, S.N., and McLagen, M. (1975) Features of infant sounds: the first eight weeks of life, J. Child Lang., 2:205-221.

Stevens, K.N., and House, A.S. (1961) An acoustical theory of vowel production and some of its implications, J. Speech Hear. Res., 4:303-320.

Stratton, P. (1982) Significance of the psychobiology of the human newborn, Chapter 1 in P. Stratton (Ed.) Psychobiology of the Human Newborn, John Wiley and Sons, Ltd., New York.

Stuart, N., and Meredith, R. (1930) Table of pediatric heights and weights, Cited in H.K. Silver, C.H. Kempe, and H.B. Bruyn (Eds.) Handbook of Pediatrics, Lange Medical Publications, Los Altos, California.

Tenold, J.L., Crowell, D.H., Jones, R.H., Daniel, T.H., McPherson, D.F., and Popper, A.N. (1974) Cepstral and stationarity analysis of full term and premature infant cries, J. Acoust. Soc. Amer., 56:975-980.

Thoden, C., and Koivisto, M. (1980) Acoustic analysis of the normal pain cry, Chapter 6 in T. Murry and J. Murry (Eds.) Infant Communication: Cry and Early Speech, College Hill Press, Houston.

Thoman, E.B. (1975) Early development of sleeping behavior in infants, In N.R. Ellis (Ed.) Aberrant Development in Infancy, 123-138, John Wiley and Sons, Inc., New York.

Truby, H.M., and Lind, J. (1965) Cry sounds of the newborn infant, In J. Lind (Ed.) Newborn Infant Cry, Almqvist and Wiksells, Uppsala.

Vaughn, V.C., III, and McKay, R.J. (Eds.) (1975) Nelson Textbook of Pediatrics, 10th ed., Saunders, Philadelphia.
Vaughn, B., and Sroufe, L.A. (1979) The temporal relationship between infant heart rate acceleration and crying, Child Dev., 50:565-567.
Vuorenkoski, V., Lind, J., Partanen, T.J., Lejeune, J., Lefourcade, J., and Wasz-Hockert, O. (1966) Spectrographic analysis of cries from children with maladie du cri-du-chat, Annls. Paediat. Fenn., 12:174.
Vuorenkoski, V., Wasz-Hockert, O., Lind, J., Koivisto, M., and Partanen, T.J. (1971) Training the auditory perception of some specific types of the abnormal pain cry in newborn and young infants, STL-QPSR, 4/1971:37-48.
Vyas, H., Milner, A.D., and Hopkin, I.E. (1981) Intrathoracic pressure and volume changes during onset of respiration, J. Ped., 99:787-791.
Wasz-Hockert, O., Koivisto, M., Vuorenkoski, V., Partanen, T., and Lind, J. (1971) Spectrographic analysis of pain cry in hyperbilirubinemia, Biol. Neon., 17:260-271.
Wasz-Hockert, O., Lind, J., Vuorenkoski, V., Partanen, T.J., and Valanne, E. (1968) The Infant Cry: A Spectrographic and Auditory Analysis, Heinemann, London.
Wasz-Hockert, O., Partanen, T.J., Vuorenkoski, V., Valanne, E., and Michelsson, K. (1964) The identification of some specific meanings in newborn and infant vocalization, Experientia, 20:154.
Weisenfeld, A.R., Malatesta, C.Z., and DeLoach, L.L. (1981) Differential parental response to familiar and unfamiliar infant distress signals, Infant Beh. Dev., 4:281-295.
Wendahl, R. (1963) Laryngeal analog synthesis of harsh voice quality, Folia Phoniat., 15:251.
Wendahl, R. (1966a) Laryngeal analog synthesis of jitter and shimmer: Auditory parameters of harshness, Folia Phoniat., 18:98-108.
Wendahl, R. (1966b) Some parameters of auditory roughness, Folia Phoniat., 18:26-32.
Wendler, J., Doherty, E.T., and Hollien, H. (1980) Voice classification by means of long-term spectra, Folia Phoniat., 32:51-60.
Wilcox, K. (1978) Age and vowel differences in vocal jitter, Unpublished Master's Thesis, Purdue University.
Wolff, P. (1969) The natural history of crying and other vocalizations in early infancy, In B.M. Foss (Ed.) Determinants of Infant Behavior, Vol. IV, Methuen, London.

Woodson, R., Reader, F., Shepherd, J., and Chamberlain, G. (1981) Blood pH and crying in the newborn infant, Infant Beh. and Dev., 4:41-45.
Zemlin, W.R. (1981) Speech and Hearing Science, Prentice-Hall, Englewood Cliffs, N.J.
Zeskind, P.S. (1981) Behavioral dimensions and cry sounds of infants of differential fetal growth, Infant Beh. Dev., 4:297-305.
Zeskind, P.S., and Lester, B.M. (1978) Acoustic features and auditory perceptions of the cries of newborns with prenatal and perinatal complications, Child Dev., 49:580-589.
Zeskind, P.S., and Lester, B.M. (1981) Analysis of cry features in newborns with different fetal growth, Child Dev., 52:207-212.

BIOGRAPHICAL SKETCH

Brian Ross Klepper was born in 1952 in Jacksonville, Florida. He received his B.A. in literature at the University of Florida in 1974, after which he taught high school and worked in the family business. In 1977, he returned to graduate school, where he has worked toward the doctorate in Speech with a specialization in the phonetic sciences. During his academic training, he has been involved in research on underwater hearing and communication in divers, the development of an underwater acoustic retrieval system, vocal indicators of psychological stress, speaker identification, neuropsychological approaches to syntactic processing and, most recently, neonatal cry. In addition to his academic pursuits, he has co-established a consulting firm, Forensic Communication Associates, which provides analysis and expert witness services to the legal community on tape-recorded evidence and human communication issues. Mr. Klepper is married and has one child.

I certify that I have read this study and that in my opinion it conforms to acceptable standards of scholarly presentation and is fully adequate, in scope and quality, as a dissertation for the degree of Doctor of Philosophy.

Professor of Speech

I certify that I have read this study and that in my opinion it conforms to acceptable standards of scholarly presentation and is fully adequate, in scope and quality, as a dissertation for the degree of Doctor of Philosophy.

Professor of Speech

I certify that I have read this study and that in my opinion it conforms to acceptable standards of scholarly presentation and is fully adequate, in scope and quality, as a dissertation for the degree of Doctor of Philosophy.

Howard B. Rothman
Associate Professor of Speech

I certify that I have read this study and that in my opinion it conforms to acceptable standards of scholarly presentation and is fully adequate, in scope and quality, as a dissertation for the degree of Doctor of Philosophy.

W. Keith Berg
Associate Professor of Psychology

I certify that I have read this study and that in my opinion it conforms to acceptable standards of scholarly presentation and is fully adequate, in scope and quality, as a dissertation for the degree of Doctor of Philosophy.

Gene Cranston Anderson
Professor of Nursing

This dissertation was submitted to the Graduate Faculty of the Department of Speech in the College of Liberal Arts and Sciences and to the Graduate School and was accepted as partial fulfillment of the requirements for the degree of Doctor of Philosophy.

May, 1985

Dean, Graduate School

