THE EFFECTIVENESS OF THE ACOUSTIC MEASURE A1-Pn IN PREDICTING PERCEPTUAL JUDGMENTS OF HYPERNASALITY

By

NOUR EL BASHITI

A DISSERTATION PRESENTED TO THE GRADUATE SCHOOL OF THE UNIVERSITY OF FLORIDA IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF DOCTOR OF PHILOSOPHY

UNIVERSITY OF FLORIDA

2014
© 2014 Nour El Bashiti
To my loving family
ACKNOWLEDGMENTS

A doctorate is a long and emotional journey; mine has been no exception. Many people have helped me over these years in so many ways that it is impossible to remember everyone. Therefore, I would like to thank anyone I might forget to mention.

I would like to express my deepest appreciation for my doctoral committee for their patience and support. I would like to thank Dr. William Williams for always believing in my abilities and for all the constructive criticism that led me through this journey. Also, I would like to thank my mentor, Dr. Rahul Shrivastav, for giving me the opportunity to work with him and for giving me the freedom to choose my research topic. In spite of the short time I spent in his lab, he opened my eyes to a new world of accomplishments. I would also like to thank Dr. Ratree Wayland and Dr. Kenneth Logan for their continuous support and guidance. Each one of you has added a pillar in building a strong foundation for me to be able to go through my PhD years.

I take this opportunity to record my sincere thanks to Ms. Debra Anderson, my international advisor, the supporting sister, the second mother, and most trusted friend. I would never have been able to reach the summit without your guidance and passionate help. Thank you for all the support you provided me. If I possess any strength, it is because you were there to inspire me to keep pursuing my dreams.

I extend my gratitude to my friends who helped me through many hard times. Sanal Buvaev, Chandan Challana, and Rohit Kumar, you kept my heart at peace and my mind at ease. I owe a lot to each one of you. You gave me many memories to cherish till the day I leave this earth.
I would also like to thank the dear friends I made in the Speech, Language and Hearing Sciences department: Supraja, my best friend and my rock, the keeper of my secrets and my trusted source of advice. Moreover, I would like to thank Jonathan, Jimena, Jinyi, Heyjin, Audry, Lisa, Dan, Jessica and Issac for their wonderful friendship. In addition, many thanks to my lab mates, Traci, Tony, Yukun, Angela and Savya. You made my time in the Speech Acoustics laboratory a memorable experience to treasure forever. I will be leaving the University of Florida with so many happy moments and so many laughs when thinking of you.

The completion of this dissertation would not have been possible without the help of Dr. Murad Qahwash, Dr. Mark Skowronski, Dr. Yousef Haddad, Idella King and Cassie Mobley. Thank you for all the help and support. Additionally, I would like to extend my thanks and appreciation to Operation Smile, Dr. John Riski, and Ms. Virginia Dixon Wood.

Last but not least, I dedicate this accomplishment to my parents, the jewel of my heart, my brothers, Ziad, Abdul Aziz, and Ahmad, and my sisters, Nahla and Randa. My lovely nephews, Abdullah and Yusuf, thanks for all the fun times and all the laughs that lit my way. My loving family has always been there for me, and they always believed I would reach higher places. If I had the determination to go through the tough times, it is because making you proud is my purpose in life. I love you all so much.
TABLE OF CONTENTS

ACKNOWLEDGMENTS
LIST OF TABLES
LIST OF FIGURES
ABSTRACT

CHAPTER

1 INTRODUCTION
    Resonance and Nasalization
    Challenges of Detecting Vowel Nasalization
    Overview of the Dissertation

2 LITERATURE REVIEW
    Nasalization Evaluation
    Perceptual Evaluation
    Instrumental Measures of Nasalization
        Visualization Techniques for the Assessment of the Velopharyngeal Mechanism
        Acoustic/Aerodynamic Techniques for the Assessment of Hypernasal Speech

3 ACOUSTIC ANALYSIS OF HYPERNASALITY USING SYNTHESIZED VOWELS
    Introduction
        Acoustic Correlates of Hypernasality
        Attempts to Quantify Hypernasality in Speech Using Acoustical Analysis
    Purpose of the Study
    Methodology
        Listeners
        Stimuli
        Perceptual Experiment
        Acoustic Analysis
        Statistical Analysis
    Results
    Discussion

4 ACOUSTIC ANALYSIS OF HYPERNASALITY USING NATURAL VOWELS
    Perceptual Analysis of Hypernasality
    Purpose of the Study
    Methodology
        Listeners
        Stimuli
        Perceptual Experiment
            Training sessions
            Testing
        Acoustic Analysis
        Challenges in Obtaining A1-Pn
        Statistical Analyses
    Results
        Statistical Findings
    Discussion

5 CONCLUSIONS
    Summary of Findings
    General Limitations
    Future Directions

APPENDIX: LIST OF SENTENCES AND THEIR PHONETIC TRANSCRIPTION
LIST OF REFERENCES
BIOGRAPHICAL SKETCH
LIST OF TABLES

Table

4-1 Average frequencies of the nasal peaks as reported by Chen (1997).
4-2 Average A1-Pn values in dB for each vowel for each speaker group.
4-3 Correlations of the Variables in the Regression Analysis.
4-4 Backward Stepwise Regression Results.
4-5 Linear Regression Results for Each Vowel.
LIST OF FIGURES

Figure

3-1 Hypernasality judgments for natural and synthetic vowels.
3-2 Distribution of vowels across hypernasality judgments.
3-3 A continuum of the ratings of hypernasality.
4-1 GUI for training 1.
4-2 GUI for training 2.
4-3 GUI for training 3.
4-4 GUI for training 4.
4-5 Label Trainer GUI.
4-6 FFT power spectrum and LPC envelope.
4-7 An example of the harmonics picked for F1, Fn, and F2.
4-8 An example of multiple harmonics affected by the extra pole-zero pair(s).
4-9 A spectral envelope showing a blend of F1 and Fn.
4-10 A spectral envelope showing a blend of F1, Fn, and F2.
4-11 An example of a spectral envelope where the extra peak could not be identified clearly.
4-12 An example of formant location shift caused by the introduction of an extra pole-zero pair due to hypernasality.
4-13 Raw DME Intelligibility judgments of Trial 1 and Trial 5, ordered from least to greatest score on Trial 1.
4-14
Abstract of Dissertation Presented to the Graduate School of the University of Florida in Partial Fulfillment of the Requirements for the Degree of Doctor of Philosophy

THE EFFECTIVENESS OF THE ACOUSTIC MEASURE A1-Pn IN PREDICTING PERCEPTUAL JUDGMENTS OF HYPERNASALITY

By

Nour El Bashiti

August 2014

Chair: William N. Williams
Major: Communication Sciences and Disorders

In the case of velopharyngeal dysfunction, several forms of resonance disorders can occur. Hypernasality occurs due to resonances associated with the nasal cavities during the production of oral sounds. The amount of oral or nasal resonance in speech can be estimated by using acoustic analysis. Hypernasality is perceived primarily on vowel segments. Several acoustic correlates of hypernasality have been reported in previous literature. The major changes in the acoustic signal have been observed in the vicinity of the first formant (F1). These changes affect the F1 and amplitude (Chen, 1997). An introduction of extra nasal resonances before and after F1 was also described (Chen, Janet, Slifka, & Stevens, 2000).

Most instruments available to assess velopharyngeal function are hard to administer, costly, and involve a degree of invasiveness (Bzoch, 2004). For more effective clinical use, it is important to develop a non-invasive and affordable method to detect and quantify hypernasality. Existing instruments only provide a binary decision indicating the presence or absence of hypernasality. Such measurement fails to effectively characterize the range of nasality that is typically seen in patients with cleft palate.
Acoustic analysis of speech can serve as an objective tool when it correlates highly with perceptual judgments of hypernasality. This work aimed to evaluate the ability of the proposed acoustic measure to explain the degree or magnitude of hypernasality in patients with cleft palate. This measure is obtained by calculating the difference between the amplitude of the highest harmonic closest to the first formant (A1) and the amplitude of the highest harmonic closest to the extra peak (Pn). The long-term goal of this research is to develop a model to predict the degree of nasality in voice. This will help develop better tools for assessment and rehabilitation of patients with velopharyngeal inadequacy.

In this dissertation, the acoustic measure was examined in both synthesized samples and natural voice recordings of hypernasal speech. Correlations of this measure with perceptual ratings were calculated to estimate the reliability of using A1-Pn to scale hypernasality of speech. The results suggest that A1-Pn is a promising measure for predicting the perception of hypernasality.
CHAPTER 1
INTRODUCTION

Resonance and Nasalization

Resonance is a physical phenomenon that modifies the sounds generated by the vocal folds. This process shapes the quality of the sound that is perceived during speech (Bzoch, 2004). The interaction between the excitation source (the vocal folds) and the vocal tract cavities determines resonance (Pruthi, Espy-Wilson, & Story, 2007). The size and shape of these resonating cavities, namely the pharyngeal cavity, the oral cavity, and the nasal cavity, form the end sound that we hear as speech (Bzoch, 2004).

Resonance disorders can take three forms. Hypernasality, the first form, occurs when there is greater coupling of the oral and nasal cavities during the production of non-nasal (oral) sounds. The second form is hyponasality, which occurs when there is insufficient sound energy resonating in the nasal cavity during the production of nasal consonants. The third form, cul-de-sac resonance, is similar to hyponasality and is due to an anterior obstruction in the nasal cavity. However, in this case the blockage prevents the sound energy from exiting through the nose while producing nasal sounds (Pruthi, Espy-Wilson, & Story, 2007).

Nasalization occurs when the velum drops to permit coupling between the oral and nasal cavities. When this happens, the oral cavity still provides the major source of output, but the sound acquires a noticeable nasal characteristic. The sounds that can be nasalized are usually vowels, but they can also include semivowels such as /w/ and /j/. Vowel nasalization can generally be divided into the following three categories. The first is coarticulatory nasalization, which is usually observed when nasal consonants occur next to vowels. The velopharyngeal port will be somewhat open during part of the
vowel that neighbors the nasal consonant, leading to nasalization of that part of the vowel (Bzoch, 2004). The second category is called phonemic nasalization. This feature is language specific, and it occurs when the vowel is not close to a nasal consonant in context. In this case the vowel is phonemically or distinctively nasalized as a feature of that particular language (Maddieson, 1984). Thus, in such languages, minimal pairs of words that differ only in the nasalization of the vowel can be found. A good example of phonemic nasalization can be found in French. The two words beau and bon are a minimal pair that differs only in the nasalization of the vowel. If you say [bo] without nasalizing the vowel, you produce beau; if you say [bõ] with a nasalized vowel, you produce bon, which is a different word. Another good example can be found in the Portuguese language. Portuguese speakers are also able to contrast nasalized vowels with their oral versions. For instance, vi and vim form such a pair. The final nasal consonant of vim is not produced; however, it adds a strong nasal feature to the vowel preceding it, similar to the case of the French example (Ladefoged, 1982).

Functional nasalization is the third category; here nasality is introduced as a result of defects in the functionality of the velopharyngeal mechanism. These defects in the velopharyngeal system can be anatomical in origin. Cleft palate is an example of coupling of the oral and nasal cavities resulting in an inadequate velopharyngeal mechanism. Another example is damage to the central nervous system, as in cerebral palsy or traumatic brain injury, or damage to the peripheral nervous system, as is the case in Moebius syndrome (Cairns, Hansen, & Riski, 1996).
Challenges of Detecting Vowel Nasalization

The exact acoustic characteristics of nasalization may vary with speaker, vowel type, and the degree of nasal coupling (Bzoch, 2004; Watterson, Lewis, Allord, Sulprizio, & O'Neill, 2007). Consequently, vowel nasalization is considered a hard feature to study. Nasalization introduces several changes in the acoustic spectrum; a main change is the introduction of zeros (Lindblom, Lubker, & Pauli, 1977; Hawkins & Stevens, 1985). These zeros are not always apparent as a drop in the spectrum, and they are very difficult to detect because of the possibility of pole-zero cancellations. Although the articulatory requirement that causes nasalization is simple, the acoustic consequences of this coupling are very complex because of the complicated structure of the nasal cavity (Watterson, Lewis, Allord, Sulprizio, & O'Neill, 2007; Chen, 1997).

The oral cavity is a single passageway. Constrictions along this passage are created by the articulators to produce different sounds. Unlike the oral cavity, the nasal cavity consists mainly of two parallel channels divided by the nasal septum (Pruthi, 2007). The area of these two channels can vary, which creates asymmetry between them (Dang, Honda, & Suzuki, 1994). Moreover, the nasal cavity has several paranasal cavities called sinuses that are linked to the two main nasal channels by small openings called ostia (Pruthi, 2007). This complicated structure of the nasal cavity introduces changes to the power spectrum of the vowel in the case of oral and nasal coupling. These changes vary with the size and configuration of the nasal passages of different individuals. This presents challenges in capturing the acoustic effect of the coupling, as the changes to the spectrum could vary considerably from one case to the other.
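The effect of an extra pole-zero pair, and the way it can vanish through cancellation, can be illustrated with a minimal numerical sketch. Everything in it is an assumption made for illustration and is not taken from this dissertation: the sampling rate, the two "oral" formants, the frequencies and bandwidths of the extra pole and zero, and the use of simple digital second-order sections to stand in for vocal tract resonances. The function names are hypothetical.

```python
import cmath
import math

FS = 10000.0  # sampling rate in Hz (illustrative assumption)

# Hypothetical oral formants as (center frequency in Hz, bandwidth in Hz).
ORAL_POLES = [(500.0, 80.0), (1500.0, 100.0)]


def second_order(freq_hz, bw_hz, f_eval_hz):
    """Evaluate 1 - 2 r cos(theta) z^-1 + r^2 z^-2 on the unit circle."""
    r = math.exp(-math.pi * bw_hz / FS)          # radius set by the bandwidth
    theta = 2.0 * math.pi * freq_hz / FS         # angle set by the frequency
    z_inv = cmath.exp(-1j * 2.0 * math.pi * f_eval_hz / FS)
    return 1.0 - 2.0 * r * math.cos(theta) * z_inv + (r * r) * z_inv * z_inv


def magnitude_db(f_eval_hz, poles, zeros):
    """Magnitude in dB of a filter built from conjugate pole and zero pairs."""
    num = 1.0 + 0j
    for f, bw in zeros:
        num *= second_order(f, bw, f_eval_hz)
    den = 1.0 + 0j
    for f, bw in poles:
        den *= second_order(f, bw, f_eval_hz)
    return 20.0 * math.log10(abs(num) / abs(den))


def coupling_effect_db(f_eval_hz, extra_pole, extra_zero):
    """Change in level (dB) caused by adding one pole-zero pair to the oral filter."""
    coupled = magnitude_db(f_eval_hz, ORAL_POLES + [extra_pole], [extra_zero])
    oral = magnitude_db(f_eval_hz, ORAL_POLES, [])
    return coupled - oral


if __name__ == "__main__":
    separated = ((1000.0, 100.0), (1200.0, 100.0))   # pole and zero apart
    cancelled = ((1000.0, 100.0), (1000.0, 100.0))   # pole and zero coincide
    for f in (800.0, 1000.0, 1200.0, 1400.0):
        print(f, round(coupling_effect_db(f, *separated), 2),
              round(coupling_effect_db(f, *cancelled), 2))
```

With the pole and zero separated, the sketch shows a boost near the pole and a dip near the zero; when they coincide, the added pair leaves the spectrum unchanged, which is the cancellation that makes the zeros hard to detect.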
Overview of the Dissertation

Chapter 1 introduces the problem. In this chapter a detailed description of nasalization is provided, together with an explanation of the importance of detecting it and of the challenges of detecting this feature. The anatomy of the vocal tract and its cavities is also described here. Resonance disorders stem from different etiologies (Cairns, Hansen, & Riski, 1996). Hence, the demand for an effective assessment tool is pressing. The following question is answered across different chapters in this dissertation: Does the difference between the amplitude of the first formant and the amplitude of the extra peak (A1-Pn) explain the variability in the perception of hypernasality in the cleft palate population?

The available assessment tools for velopharyngeal port deficits are described in chapter 2. The advantages and disadvantages of each technique are presented. The most common assessment tools provide a visual illustration of the mechanism and of the deficit, if present (Johns, Rohrich, & Awada, 2003). This visual illustration is important in the diagnosis process and in dictating the appropriate intervention options (Bzoch, 2004). However, an illustration of the anatomical structure does not provide an objective measure for evaluating and quantifying perceived hypernasality. Acoustic analysis has been investigated as a tool to solve this problem. Easy administration and low cost were two of the leading reasons to consider this assessment technique.

Chapters 3 and 4 discuss the effectiveness of an acoustic parameter, A1-Pn, in scaling hypernasality. The question under investigation is approached in two distinct ways: through speech synthesis and through natural hypernasal speech. In chapter 3, an experiment utilizing speech synthesis is described. The use of synthesized speech samples is helpful as the location of the extra pole-zero pairs can
be set by the investigator. Here the acoustic cue under examination is manipulated to control the location and extent of the extra nasal peak. Speech synthesis provides better control of the variables that interfere with assessing hypernasality in a natural speech setting (Vijayalakshmi & Reddy, 2004). If we can account for all the changes that occur in the power spectrum of nasalized vowels, algorithms to scale this resonant quality will be possible. Moreover, better speech models that represent the vocal tract function can be created for automated speech synthesis systems.

Synthesis of hypernasal speech can be a challenging task given the ambiguity of the effects that oral-nasal coupling introduces to the energy throughout the vowel. The study of natural speech samples that are characterized as hypernasal could help relate the change in the acoustic parameter across a wide range of hypernasality. Chapter 4 describes an acoustic-perceptual experiment using natural speech samples. The aim of the acoustic-perceptual experiment is to validate the use of an acoustic parameter to predict perceptual ratings.

Different types of scaling techniques could be used to study the perception of nasality (Zraick & Liss, 2000). In chapter 3, the perceptual study was conducted using an equal-interval rating scale task. In chapter 4, the experiment was conducted using a direct magnitude estimation (DME) scaling task. DME was used because it does not assume a linear continuum of perceptual quality (Schiavetti, Metz, & Sitler, 1981). More details are provided in chapters 3 and 4 regarding the rating scales used for both experiments.

Chapter 5 summarizes the general findings of both studies presented in chapters 3 and 4. Based on the conclusions drawn from both studies, future directions for this work are described.
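As a concrete reading of the measure examined in chapters 3 and 4, the following minimal sketch computes A1-Pn from a list of harmonic amplitudes. It is an illustration under stated assumptions, not the procedure used in this dissertation: the harmonic values, the F1 and extra-peak frequencies, the search window of one fundamental period on either side of each target, and the function names are all hypothetical.

```python
def strongest_harmonic_near(harmonics, target_hz, window_hz):
    """Return (frequency, amplitude_dB) of the strongest harmonic near target_hz.

    harmonics is a list of (frequency_hz, amplitude_db) pairs. The search
    window of +/- window_hz is an assumption made for this sketch.
    """
    candidates = [(f, a) for f, a in harmonics if abs(f - target_hz) <= window_hz]
    if not candidates:
        raise ValueError("no harmonic found near %.1f Hz" % target_hz)
    return max(candidates, key=lambda h: h[1])


def a1_minus_pn(harmonics, f1_hz, fn_hz, f0_hz):
    """A1-Pn in dB: harmonic amplitude near F1 minus harmonic amplitude near the extra peak."""
    _, a1_db = strongest_harmonic_near(harmonics, f1_hz, f0_hz)
    _, pn_db = strongest_harmonic_near(harmonics, fn_hz, f0_hz)
    return a1_db - pn_db


# Hypothetical harmonic spectrum of a vowel with a 200 Hz fundamental.
EXAMPLE_HARMONICS = [
    (200.0, 30.0), (400.0, 42.0), (600.0, 45.0),
    (800.0, 33.0), (1000.0, 38.0), (1200.0, 29.0),
]

if __name__ == "__main__":
    # F1 assumed near 550 Hz and the extra nasal peak near 950 Hz.
    print(a1_minus_pn(EXAMPLE_HARMONICS, 550.0, 950.0, 200.0))  # prints 7.0
```

In this reading, a stronger extra peak raises Pn and therefore lowers A1-Pn, so the measure shrinks as the extra peak becomes more prominent relative to the first formant.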
CHAPTER 2
LITERATURE REVIEW

Nasalization Evaluation

Instruments employed to measure hypernasality in speech give a binary decision for the presence or absence of hypernasality. However, they fail to provide a measure that scales this resonant property. Some of these methods are invasive and expensive to administer (Johns, Rohrich, & Awada, 2003). In some cases, they are affected by multiple factors that might influence the efficacy of the instrument used (Bzoch, 2004; Watterson, Lewis, Allord, Sulprizio, & O'Neill, 2007). In this chapter, different approaches to evaluating nasalization will be discussed. Namely, perceptual evaluation and instrumental techniques of visualizing the velopharyngeal mechanism will be described.

Perceptual Evaluation

…, 2004). Listener judgment can be considered a form of indirect observation. Like voice disorders, resonance problems are classified as perceptual qualities, and listener judgment has the highest face validity for assessing this quality; that is, it appears to be a good indicator of the magnitude of the effect of velopharyngeal inadequacy on communication (Lee, Whitehill, & Ciocca, 2009). Perceptual judgment is a key factor in documenting the severity of hypernasality; however, high reliability is difficult to obtain. Multiple factors can affect perceptual judgments. These factors are associated with the listener, the rating task, or the interaction of listener and task factors (Eadie & Baylor, 2006). Listener factors include aspects that could be formed by training or experience. These include the specific
internal standards of the listener for the property being judged, the discrete perceptual habits and biases of the listener, and the general sensitivity to the feature being rated (Eadie & Baylor, 2006; Kreiman, Gerratt, Kempster, Erman, & Berke, 1993). Other listener factors, which are considered random errors, include listener fatigue, attention lapses, and transcription errors. Task factors concern issues such as how well the rated percepts were defined, the context in which judgment is to be made, and the resolution and specificity of the rating scale (Eadie & Baylor, 2006; Kent, 1996). Listener factors and task factors can also interact; for instance, different participants may interpret the points on the same rating scale differently and may use the scale in different ways (Eadie & Baylor, 2006).

The suitability of perceptual analysis as a principal tool for identifying abnormal speech patterns and providing differential diagnosis and treatment programs is controversial. Because of the difficulty of achieving statistically high reliability, perceptual judgment cannot serve as the primary means for the assessment and treatment of velopharyngeal insufficiency (Kearns & Simmons, 1988). In the study of vocal quality, Gerratt et al. (1993) have suggested that listeners use an internal standard when rating stimuli. They tend to make their perceptual judgments by matching the stimulus to that internal standard. Human internal standards can be shaped by many variables. This makes reliability of perceptual judgments hard to achieve for different qualitative dimensions, including hypernasality. Judgments of hypernasality can be influenced by the accuracy of articulation, age and gender of the speaker, some acoustic properties of the stimulus, such as its fundamental frequency,
the length and type of the speech sample, the rate at which the sample was uttered, anxiety level, and the degree of velopharyngeal insufficiency (Bzoch, 1979). When judging hypernasality, listener reliability was also found to be influenced by the length of the stimuli. It appeared to be higher for sentences than for words or isolated vowels (Counihan & Cullinan, 1970).

The coupling of the oral and nasal cavities has a different impact on different vowels. Maeda (1982c), for example, compared the change in the power spectrum of high vowels and low vowels as a result of nasalization. His findings indicated a reduction in the overall energy of the power spectrum for both types of vowels. However, the subsequent acoustic effect of the oral-nasal coupling was different for low vowels than for high vowels. The spectral envelope of low vowels was better conserved than that of high vowels when both were nasalized. The primary effects of hypernasality occur in the low frequencies (House & Stevens, 1956; Fant, 1960; Maeda, 1982c; Hawkins & Stevens, 1985; Stevens, Andrade, & Viana, 1987a). High vowels have a first formant in the low frequencies. Consequently, high vowels show more change to their spectral envelope as a result of hypernasality than low vowels do (Maeda, 1982c). This fact might raise uncertainties when we attempt to evaluate perceptual judgments of hypernasality (Pruthi, 2007). Do listeners use the same threshold of nasalization across all vowels to judge hypernasality? It is still unclear what exactly happens (Pruthi, 2007). Accordingly, one could argue that perceptual judgments of hypernasality might be affected by the type of vowel presented in the stimuli. To resolve this uncertainty, studying the effect of vowel type on the ratings of hypernasality is warranted.
Two key factors must be taken into consideration when conducting perceptual assessment of hypernasality. Both formal academic training in the nature and causes of hypernasal speech and the clinical experience of the listener in judging this resonant quality affect the reliability of perceptual judgment (Lewis, Watterson, & Houghton, 2003). When both subject knowledge and training are present, reliability increases compared to when only one or neither exists. Reliability of perceptual judgment of hypernasality is influenced more by listener experience than by academic training (Lewis, Watterson, & Houghton, 2003; Laczi, Sussman, Stathopoulos, & Huber, 2005).

Instrumental Measures of Nasalization

Perceptual evaluation is subjective (Bzoch, 2004). As a result, the probability of error increases, which raises the need for a more objective assessment tool. Instrumental assessment exhibits more reliability in decision making during diagnosis and in providing feedback during speech intervention (Bzoch, 2004). Recently, substantial research aiming to find valid and reliable instrumental measures of velopharyngeal disorders has been reported. The goal of instrumental assessment is to obtain anatomic data regarding the adequacy of the velopharyngeal mechanism (Bzoch, 2004). This information includes values of velar length, a description of the elevation of the palate, the size of the velopharyngeal gap, the movement of the lateral and posterior pharyngeal walls, nasopharyngeal depth and pattern, and the level of closure (Bzoch, 2004).

Two kinds of techniques are used in the diagnostic process: direct and indirect (Johns, Rohrich, & Awada, 2003). Direct methods are those that allow the visualization of the velopharyngeal port and the structures that work together to close it. Additionally, direct methods allow the investigator to observe how these structures move during various speech and non-speech activities (Johns, Rohrich, & Awada, 2003; Bzoch, 2004). On the other hand, indirect instrumental techniques provide information that helps the investigator make inferences about velopharyngeal function by observing the vocal tract.

Velopharyngeal dysfunction can usually be diagnosed by observing certain indications in speech. These indications include hypernasality, compensatory misarticulations, nasal air emission, weak oral pressure for high-pressure consonants, and facial grimaces. Although resonance disorders can be recognized during speech evaluation, further investigation of the case is needed. Vital aspects to consider are the cause, size, and location of the velopharyngeal opening. This information is needed so that the appropriate intervention can be determined.

Visualization Techniques for the Assessment of the Velopharyngeal Mechanism

Imaging options include videofluoroscopy and fiberoptic nasoendoscopy (Bzoch, 2004). The proficiency of velopharyngeal closure is assessed using the radiographic technique of videofluoroscopy. Here, the velopharyngeal mechanism is observed during motion. To get the required contrast in the videofluoroscopic image, the structures are coated with barium. This procedure aids in the assessment of several forms of velopharyngeal insufficiency, including cleft palate. Videofluoroscopy is performed as part of the surgical planning that is used to tailor the ideal treatment method (surgical, prosthetic, or behavioral therapies) (Johns, Rohrich, & Awada, 2003). Another advantage of videofluoroscopy is that it permits the investigator to observe the tongue. The tongue is observed during this process because it can be part of a compensatory mechanism that contributes to the closure of the velopharyngeal port. This information may be missed with other assessment tools.
However, although videofluoroscopy is a very useful method, it has the apparent disadvantage of exposure to radiation (Johns, Rohrich, & Awada, 2003; Bzoch, 2004).

Fiberoptic nasoendoscopy is another instrument used in studying velopharyngeal inadequacy. This technique provides the investigator with a superior view of the velopharyngeal port structures, including the velum, the posterior pharyngeal wall, and the lateral pharyngeal walls (Bzoch, 2004). It also aids in observing the closure pattern of the velopharyngeal port (Havstam, Lohmander, Persson, Dotevall, Lith, & Lilja, 2005). Nasoendoscopy is not a painful procedure; however, it can cause minor discomfort (Johns, Rohrich, & Awada, 2003). It is especially important after surgery because it permits direct visualization of the velopharyngeal sphincter. It is also important for visualization of the area during placement of a pharyngeal flap. Another advantage of nasoendoscopy is that the subject can be engaged in connected speech during the procedure, so an inconsistent closure pattern can be detected (Havstam, Lohmander, Persson, Dotevall, Lith, & Lilja, 2005). Compared with videofluoroscopy, fiberoptic nasoendoscopy is somewhat more invasive. The procedure also calls for cooperation from the patient to obtain an acceptable examination (Bzoch, 2004).

Acoustic/Aerodynamic Techniques for the Assessment of Hypernasal Speech

Temporal and spectral characteristics of hypernasality have been examined using several methods (Warren & DuBois, 1964). Although these techniques and measures have provided important information on the temporal characteristics of hypernasal speech, they are seldom used in clinical settings, mainly because they involve a degree of procedural complexity and the equipment is unavailable in such settings (Bae, Kuehn, & Ha, 2007).
The pressure-flow technique described by Warren and DuBois (1964) examines the aerodynamic conditions in both the oral and nasal cavities during speech. It provides quantitative data regarding the overall velopharyngeal port area. The theoretical basis of this technique was built upon principles of fluid mechanics. Their hypothesis was that an estimate of the size of an existing velopharyngeal gap could be calculated during speech production using the pressure-flow technique. They calculated the size of the opening from the differential pressure and the simultaneous rate of flow across the velopharyngeal port (Warren & DuBois, 1964).

When the velum elevates and the velopharyngeal port closes in normal cases, a pressure difference between the oral and nasal cavities occurs. Here the pressure in the oral cavity is determined by respiratory effort and is usually about 3 to 8 cmH2O. Pressure in the nose is atmospheric, or zero, since no air escapes to the nasal cavity. In the case of oral and nasal coupling, the difference in pressure between the two cavities gets smaller as air escapes to the nasal cavity (Warren, 1979).

Throughout the vocal tract, constrictions are formed during different speech productions (Bzoch, 2004). The velopharyngeal sphincter is an example of such a constriction. When air passes through these constrictions, sphincter-type airflow is produced (Warren & DuBois, 1964). The relationship between the measured air pressure and the airflow through the vocal tract constrictions can be used to estimate the size of the velopharyngeal opening (Warren, 1976). The values for the oral and nasal pressures are measured using two catheters. One catheter is placed in the mouth and the other is secured with a cork in one nostril. These catheters are attached to a pressure transducer. Nasal airflow is measured by placing a cork-secured tube into the other nostril. This tube is connected
to a heated pneumotachograph. By applying hydrokinetic principles to those values, the opening size can be estimated (Bzoch, 2004).

Nasalance is a measure used to acquire indirect information about velopharyngeal port function (Bzoch, 2004). This measure provides information about the acoustic interaction of the oral and nasal cavities during speech production (Schneider, & Shprintzen, 1980; Dalston, & Warren, 1986; Fletcher, 1976). The Tonar II, developed by Fletcher (1976), provides this nasalance score. The Tonar II was the basis for the development of another device, the Nasometer, which is a tool mainly used to obtain nasalance scores. The Nasometer (KayPENTAX, Lincoln Park, NJ) is a computer-based device that has two microphones separated by a plate. The output measure is the ratio of the acoustic energy radiated through the nares, which is collected by the upper microphone, to the total (nasal plus oral) acoustic energy, where the oral component is collected by the lower microphone; the ratio is expressed as a percentage. Perceptually evident hypernasality is associated with a high nasalance score, defined as 32% or higher. Although nasalance scores are widely used in clinics to detect resonance problems, meaningful information can be obtained only when the nasalance scores of two subjects, or the nasalance scores of the same subject obtained at two different times, are compared (Dalston, Warren, & Dalston, 1991; Bzoch, 1979).

A key point to keep in mind when interpreting nasalance scores is their variability. Sources include the machine used (such as the model and its calibration), the test procedure, between-subject variability such as that resulting from age, sex, and dialect differences, and within-subject variability including degree of
nasal congestion or nasal patency at the time of testing (Seaver, Dalston, Leeper, & Adams, 1991; Lewis, Watterson, & Blanton, 2008).

Another approach to quantifying hypernasality is the Horii Oral Nasal Coupling Index (HONC). The HONC is not commercially marketed and has not been widely used by clinicians as an assessment tool for hypernasality (Laczi, Sussman, Stathopoulos, & Huber, 2005; Sussman, 1995; Mra, Sussman, & Fenwick, 1998). The HONC measure is the ratio of the nasal vibration level to the vibration level at the throat. These values are collected by placing accelerometers on both the nose and the throat (Horii, 1980; Horii, & Lang, 1981). The information obtained by this technique is expressed in decibels as a function of time (Horii, 1980). The ratio is multiplied by a correction factor that normalizes the accelerometric signals to account for individual differences. The correction factor is the ratio of the root mean square of the accelerometric signal at the throat to the root mean square of the nasal accelerometric signal during the sustained production of [m] (Horii, 1980). An advantage of this measure is that it accounts for individual differences that stem from the nature of nasal tissue and from vocal intensity (Horii, 1981; Horii, & Monroe, 1983; Sussman, 1995), and for variation caused by gender differences (Sussman, 1995).

An objective analysis technique is most useful when the measure used for the assessment provides information specific to the velopharyngeal mechanism and its function. This information is of greatest value when it also correlates highly with perceptual judgments (Litzaw, & Dalston, 1992; Seaver, Dalston, Leeper, & Adams, 1991).
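As a rough illustration of the HONC computation described above, the following Python sketch applies the [m]-based correction factor to the nasal-to-throat accelerometer ratio and expresses the result in decibels. The function names, the sample values, and the choice of 20 log10 for the decibel conversion are assumptions made for illustration; they are not taken from Horii (1980).

```python
import math

def rms(samples):
    """Root mean square of a sequence of accelerometer samples."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))

def correction_factor(throat_m, nasal_m):
    """Throat RMS divided by nasal RMS during a sustained [m]."""
    return rms(throat_m) / rms(nasal_m)

def honc_db(nasal_frame, throat_frame, k):
    """Corrected nasal/throat ratio for one analysis frame, in dB.

    Assumption: amplitude ratios are converted with 20*log10, so the
    index is 0 dB when a frame is as nasal as the speaker's own [m].
    """
    ratio = k * rms(nasal_frame) / rms(throat_frame)
    return 20.0 * math.log10(ratio)

# Hypothetical calibration frames recorded during a sustained [m]
# (arbitrary units).
throat_m = [0.8, -0.8, 0.8, -0.8]
nasal_m = [0.4, -0.4, 0.4, -0.4]
k = correction_factor(throat_m, nasal_m)

# Hypothetical frame taken from a vowel.
vowel_nasal = [0.1, -0.1, 0.1, -0.1]
vowel_throat = [0.8, -0.8, 0.8, -0.8]
print(round(honc_db(vowel_nasal, vowel_throat, k), 1))
```

Because the correction factor is derived from each speaker's own [m], a frame that reproduces the [m] calibration values yields 0 dB, and less nasal frames yield negative values.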
The next chapter discusses several acoustic techniques that have been used to relate the acoustic characteristics of a voice signal to listeners' perceptual ratings, with the aim of reaching a better assessment of the velopharyngeal mechanism.
CHAPTER 3
ACOUSTIC ANALYSIS OF HYPERNASALITY USING SYNTHESIZED VOWELS

Introduction

Acoustic Correlates of Hypernasality

In velopharyngeal dysfunction, sound energy that would normally be directed orally escapes through the nasal cavity instead. In this case, listeners perceive speech as being hypernasal (Bzoch, 2004). The degree of oral or nasal resonance in speech can be judged through acoustic analysis of speech. The acoustic effects of hypernasality vary across speakers, across phonetic contexts, and with the sound under examination (Bzoch, 2004; Watterson, Lewis, Allord, Sulprizio, & O'Neill, 2007). Hypernasality is often perceived on vowels.

Several acoustic correlates of hypernasality in vowels have been reported in the literature (House, & Stevens, 1956; Hattori, Yamamoto, & Fujimura, 1958; Fant, 1960; Fujimura, & Lindqvist, 1971; Lindblom, Lubker, & Pauli, 1977; Hawkins, & Stevens, 1985). Major changes in the acoustic signal have been observed in the vicinity of the first formant. These changes include an increase in the first formant bandwidth, which is related to a reduction of that formant's amplitude (Lindblom, Lubker, & Pauli, 1977; Hawkins, & Stevens, 1985). The introduction of an extra pole-zero pair (nasal resonances) in the range of 200 to 500 Hz (Hattori, Yamamoto, & Fujimura, 1958) or the range of 700 to 2000 Hz (House, & Stevens, 1956) was also described. The exact location of the pole and zero depends on the vowel type and the size of the opening between the nasal and oral cavities. It is argued that the different locations of the pole have different effects on the perception of hypernasality (Hawkins,
& Stevens, 1985; Fant, 1960). A change in the amplitude of the second and third formants and a shift in their frequency were observed, as well as an extra pole-zero pair in the third formant region. Hypernasality introduces a downward shift of the second formant (F2) and an upward shift of the third formant (F3). This results in a widening of the F2-F3 region (Bognar, & Fujisaki, 1986). The extra poles at high frequencies appear as small notches in the spectrum (Stevens, Fant, & Hawkins, 1987b; Pruthi, 2007; Karim, Kaykobad, & Murshed, 2013). The major observed effect of nasalization is a change in the spectral center of gravity or a spectral flattening in the range of 300 to 2500 Hz. This translates to a reduction in the overall amplitude of the vowel (Pruthi, Espy-Wilson, & Story, 2007; Hawkins, & Stevens, 1985).

Most studies evaluating the acoustic correlates of nasality in normal speakers give a binary classification of phonemes by reporting the presence or absence of nasalization. However, in speakers who exhibit hypernasal speech, it is necessary to quantify and scale the degree or severity of nasality. None of the methods currently available qualifies as a universally accepted approach to quantify the perception of hypernasality or to provide an objective measure to scale this resonant quality (Seaver, Dalston, Leeper, & Adams, 1991).

Attempts to Quantify Hypernasality in Speech Using Acoustical Analysis

Although a number of acoustic characteristics have been associated with hypernasality, only a few attempts have been made to quantify hypernasal speech by analyzing the spectral energy of the vowel (Lee, Ciocca, & Whitehill, 2003; Lee, Wang, Yang, & Kuo, 2006; Vogel, Ibrahim, Reilly, & Kilpatrick, 2009). In this chapter, several methods for describing and quantifying hypernasality will be discussed. One
method is the one-third octave technique. One-third octave spectral analysis was developed by Kataoka and his colleagues in the 1980s. The method transforms the acoustic signal into its power spectrum by means of Fast Fourier Transform (FFT) analysis and then filters it into one-third octave bands in a certain frequency range. A frequency versus amplitude plot is then created. For each band, the mean amplitude is calculated. To account for individual variability caused by differences in loudness, the spectrum in this method is normalized by adjusting the amplitude of the band containing the fundamental frequency (F0) to 0.0 dB (Kataoka, Michi, Okabe, Mirura, & Yoshida, 1996; Kataoka, Zajac, Mayo, Lutz, & Warren, 2001). As a result, in the adjusted spectrum the band that contains F0 has a value of 0.0 dB. Although this technique could be used to distinguish normal from hypernasal speech, it achieved only a modest level of agreement with perceptual judgments, and only for two vowels, namely /i:/ and /a:/ (Kataoka, Michi, Okabe, Mirura, & Yoshida, 1996). Using this technique, the researchers compared the spectral profiles of individuals with hypernasal speech and controls by examining the one-third octave bands. For the nasalized vowel, they mainly observed an increase in intensity between the first and second formants around 1 kHz and a reduction in signal intensity between the second and third formants (Kataoka, Michi, Okabe, Mirura, & Yoshida, 1996; Kataoka, Zajac, Mayo, Lutz, & Warren, 2001; Vogel, Ibrahim, Reilly, & Kilpatrick, 2009).

Another acoustic analysis technique is the voice low tone to high tone ratio (VLHR), which was developed by Lee et al. in 2003. The VLHR technique divides the spectrum of a vowel into low-frequency and high-frequency power using a predetermined cutoff frequency; the ratio is expressed in decibels. Research using this
method suggests that the VLHR index increases as hypernasality increases. The cutoff frequency is calculated by using the fundamental frequency (f0) times the geometric specific cutoff point of 600 Hz to divide the spectrum (Lee, Ciocca, & Whitehill, 2003; Lee, Wang, Yang, & Kuo, 2006; Vogel, Ibrahim, Reilly, & Kilpatrick, 2009). The power in each region is calculated by summing the intensity over that frequency range. Higher VLHR values are observed in nasalized vowels than in oral vowels. Nonetheless, the VLHR technique did not show any significant differences between perceived hypernasality and normal resonance when comparing participants with a resonance deficit to normal non-nasal controls (Lee, Ciocca, & Whitehill, 2003).

As mentioned before, hypernasality causes extra peaks in the power spectrum of a vowel. The prominence of such a peak depends on two acoustic cues: the widening of the first formant bandwidth, and the location of the added pole-zero pair together with the distance between the pole and the zero (Pruthi, Espy-Wilson, & Story, 2007; Chen, Janet, Slifka, & Stevens, 2000). The first formant amplitude (A1) is inversely related to its bandwidth (B1), and the amplitude of the extra peak (P) is directly related to the distance between the pole and the zero. Chen (1995) used this information to propose a measure that involves both cues. This measure is the difference between the amplitude of the first formant and the amplitude of the extra peak (A1-P0/A1-P1). The aim was to be able to synthesize more natural speech and to develop an accurate speech production model by examining the effect of the nasal tract on the system (Chen, Janet, Slifka, & Stevens, 2000).
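The A1-P difference described above can be sketched in a few lines of Python. This is a minimal illustration and not Chen's (1995) procedure: the search windows, the toy spectra, and all function names are hypothetical, and a real analysis would locate A1 and the extra peak on individual harmonics of a measured spectrum.

```python
def peak_db(spectrum_db, bin_hz, center_hz, half_width_hz):
    """Largest level (dB) found within center_hz +/- half_width_hz."""
    lo = max(0, int((center_hz - half_width_hz) / bin_hz))
    hi = min(len(spectrum_db) - 1, int((center_hz + half_width_hz) / bin_hz))
    return max(spectrum_db[lo:hi + 1])

def a1_minus_p(spectrum_db, bin_hz, f1_hz, nasal_peak_hz, half_width_hz=100.0):
    """Difference (dB) between the first formant amplitude (A1) and the
    extra nasal peak amplitude (P).

    Assumption: each amplitude is taken as the maximum level inside a
    fixed window around the expected frequency.
    """
    a1 = peak_db(spectrum_db, bin_hz, f1_hz, half_width_hz)
    p = peak_db(spectrum_db, bin_hz, nasal_peak_hz, half_width_hz)
    return a1 - p

def toy_spectrum(peaks, n_bins=2000, floor_db=-60.0):
    """Flat spectrum with 1 Hz bins and a few isolated peaks (Hz -> dB)."""
    spectrum = [floor_db] * n_bins
    for hz, level_db in peaks.items():
        spectrum[hz] = level_db
    return spectrum

# Hypothetical high vowel: F1 near 300 Hz, extra peak near 950 Hz.
oral = toy_spectrum({300: -10.0, 950: -25.0})
nasalized = toy_spectrum({300: -16.0, 950: -18.0})

print(a1_minus_p(oral, 1.0, 300, 950))       # 15.0
print(a1_minus_p(nasalized, 1.0, 300, 950))  # 2.0
```

In the toy example the nasalized spectrum has both a weaker first formant and a stronger extra peak, so the difference shrinks, which is the direction of change the measure is meant to capture.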
It is assumed that with greater velopharyngeal coupling, the prominence of the extra peak increases in the lower frequency range of the spectrum of a nasal vowel (Chen, 1995; Chen, 1997; Pruthi, 2007). The A1-P0/A1-P1 measure analyzes the signal in the frequency domain. It is important to keep in mind that the source properties (laryngeal system) affect this value. For example, an increase in the open quotient of the glottal waveform is expected to affect the amplitude of the first harmonic, which in turn affects the overall power spectrum of the vowel (Chen, 1996). For a nasal vowel the value of A1-P0/A1-P1 would be smaller than for an oral vowel. Consequently, if an extra peak occurs in a lower frequency range (lower than the first formant frequency), its amplitude will increase as the open quotient increases. This in turn will lead to an even smaller A1-P0/A1-P1 value (Chen, 1995; Chen, 1997; Pruthi, 2007).

Different vowels differ in the distribution of energy along the power spectrum, and the location of the formants differs across vowels. For high vowels, the first formant is located at lower frequencies, so the expected extra peak occurs between the first and second formants (Chen, 1995; Chen, 1997). For these vowels the peak amplitude is denoted P1 to indicate that it occurs after the first formant. For low vowels the first formant occurs at higher frequencies, so the extra peak is located below the first formant. For low vowels, the amplitude of the extra peak is denoted P0 (Chen, 1995; Chen, 1997; Pruthi, 2007).

Purpose of the Study

This experiment investigated the effect of extra peaks in the transfer function on the perception of hypernasality. To create a range of hypernasality, certain harmonics were excited in the transfer function of each vowel. These peaks were introduced by manipulating the amplitude of a harmonic in three different locations. The amplitude of
those harmonics was increased by one of three predetermined increments, as discussed later in this section. The research questions of this experiment were as follows: Does the location of the extra peak affect perceptual ratings of hypernasality? Does an increase in the amplitude of the extra peak result in an increase in perceived hypernasality?

Methodology

Listeners

Listeners were five native speakers of American English. They were undergraduate and graduate students at the University of Florida, all of whom were majoring in speech and hearing sciences. They ranged in age from 18 to 30 years on the day of testing. All listeners were screened for hearing loss (air conduction pure tone thresholds below 40 dB HL at 500 Hz, 1 kHz, 2 kHz, and 4 kHz).

Stimuli

Two sets of stimuli were recorded. Three vowels, / comfortable pitch and loudness for 10 seconds were obtained from individuals diagnosed with a cleft palate, with ages ranging from 4 to 30 years. Stimuli were recorded during an Operation Smile medical mission in Aswan, Egypt, from each of the patients who came to the screening location of the mission. Data collection was approved by University of Florida IRB01 #71 2010. The second set of stimuli consisted of six vowels (/i/, /u/, / /, /æ/) that were recorded from 4 normal English speakers (2 male and 2 female), with ages ranging from 21 to 30 years. These speakers were undergraduate students in the Speech, Language and Hearing Sciences department at the University of Florida. All recordings were made using a Marantz professional solid state recorder (PMD671) with a head-worn condenser microphone
(Audio-Technica; ATM73a). Recordings were made at a sampling rate of 44100 Hz and a 16-bit quantization level.

For the samples obtained from normal speakers, MATLAB version 8 was used to edit the recordings and generate a set of synthesized vowels. A Fast Fourier Transform (FFT) analysis was performed on a one-second sample extracted from the original recording of each vowel. The size of the FFT was set to 44100. This FFT size was selected to control the bin size. The bin size is equal to the sampling rate divided by the number of bins (FFT size). Consequently, the size of each bin was 1 Hz.

To create the set of synthesized vowels, the amplitude of the harmonic closest to certain frequencies was modified. These frequencies were chosen according to the frequency of the extra peak reported in the literature (Chen, 1995). They fell into three groups according to the location of the extra peak required: Peak 1: 950 Hz for high vowels, 215 Hz for low vowels; Peak 2: 1150 Hz for high vowels, 415 Hz for low vowels; and Peak 3: 1350 Hz for high vowels, 615 Hz for low vowels. The phase was not manipulated, and the amplitude was varied by a factor of 20, 30, or 40. This change in intensity corresponds to a change of about 13 dB (10 log10 20), 14 dB (10 log10 30), and 16 dB (10 log10 40), respectively. After the introduction of the extra peak, an inverse FFT of the modified magnitude spectrum was computed and the signal was written to a wav file. These stimuli were later converted to a dat format to be used in the perceptual study.

A total of 237 stimuli were used in the later perceptual part of the experiment: 21 natural and 216 synthetic (4 speakers x 6 vowels x 3 peak locations x 3 amplitude levels). Two speech-language pathologists experienced in resonance disorders were asked to listen to the synthesized vowels. They both recommended the
removal of seven of the synthesized stimuli because they were perceived as having a buzz-like sound. Consequently, those seven stimuli were removed from the pool of vowels chosen for the perceptual study.

Perceptual Experiment

Data collection took place in the Speech Acoustics Laboratory at the University of Florida. The hardware used for data collection was a TDT System III (Tucker Davis Technologies, Inc.) running the software program SykofizX (Tucker Davis Technologies, Inc.) with an ER2 insert earphone transducer (Etymotic Research, Inc.). Stimuli were rated on a 7-point rating scale, where 1 represents no hypernasality and 7 represents severe hypernasality, using a computer keyboard/mouse. Stimuli were presented 5 times in random order, monaurally to the right ear, in a sound-treated booth. The testing process required approximately 2 hours for each listener. This listening experiment was approved by University of Florida IRB02 #2009 U 1305.

Acoustic Analysis

The analysis of the recordings was implemented in MATLAB version 8 to estimate a number of acoustic features. These features include first formant frequency (F1), first formant amplitude (A1), second formant frequency (F2), second formant amplitude (A2), extra peak frequency (Fn), and extra peak amplitude (Pn).

Statistical Analysis

Inter-judge reliability was defined as the degree of consistency between listeners in their average ratings of hypernasality severity for all the stimuli. Intra-judge reliability was defined as the degree of consistency within listeners across the five trials of hypernasality severity rating for all the stimuli.
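The stimulus manipulation described in the Stimuli section (raise one harmonic in the spectrum, leave the phase untouched, and transform back to the time domain) can be sketched as follows. This is only an illustration under stated assumptions: the study used MATLAB with a 44100-point FFT, whereas the sketch uses a plain Python discrete Fourier transform on a short toy signal; the function names are hypothetical; and the factor is assumed to apply to power, so that the level change equals 10 log10 of the factor.

```python
import cmath
import math

def dft(x):
    """Discrete Fourier transform (direct O(n^2) form, for short signals)."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]

def idft(spectrum):
    """Inverse DFT; returns the real part of each sample."""
    n = len(spectrum)
    return [(sum(spectrum[k] * cmath.exp(2j * math.pi * k * t / n)
                 for k in range(n)) / n).real
            for t in range(n)]

def gain_db(power_factor):
    """Level change in dB for a given power ratio."""
    return 10.0 * math.log10(power_factor)

def boost_harmonic(signal, fs, f0, target_hz, power_factor):
    """Raise the harmonic of f0 closest to target_hz, keeping its phase.

    Assumptions: the harmonic lies on a DFT bin strictly between 0 Hz and
    the Nyquist frequency, and power_factor is a power (not amplitude) ratio.
    """
    n = len(signal)
    bin_hz = fs / n                        # frequency resolution
    harmonic_hz = max(1, round(target_hz / f0)) * f0
    k = round(harmonic_hz / bin_hz)
    spectrum = dft(signal)
    gain = math.sqrt(power_factor)         # real gain leaves the phase unchanged
    spectrum[k] *= gain
    spectrum[n - k] *= gain                # mirror bin keeps the output real
    return idft(spectrum), harmonic_hz

# Toy "vowel": three harmonics of 100 Hz sampled at 1000 Hz for 0.1 s.
fs, f0, n = 1000, 100, 100
signal = [sum(a * math.sin(2 * math.pi * h * f0 * t / fs)
              for h, a in ((1, 1.0), (2, 0.5), (3, 0.25)))
          for t in range(n)]

boosted, boosted_hz = boost_harmonic(signal, fs, f0, 215, 20)
print(boosted_hz, round(gain_db(20), 2))
```

With a 44100-sample segment at a 44100 Hz sampling rate, `bin_hz` would be 1 Hz, which matches the bin size reported for the study.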
To determine the effects of peak location, peak amplitude, and vowel type on the perception of hypernasality, a three-way ANOVA with post hoc comparisons using correction was used. The independent variables of this study were the extra peak location, the extra peak amplitude, and vowel type. All statistical procedures were carried out using SPSS version 18.0 (SPSS Inc., Chicago, IL).

Results

The average of the responses across the five trials for each stimulus was calculated, and a correlation coefficient matrix for the five raters was produced. The matrix showed significant inter-judge correlations (p < 0.01) between all five listeners. The average inter-subject correlation was 0.428 and the average intra-subject correlation was 0.483.

Both natural and synthetic stimuli represented a range of nasality judgments spread between low and high nasality ratings (Figure 3-1). The data represented a continuum of nasality ranging from 2 to 6 (2 represents low nasality and 6 represents high nasality) (Figure 3-2). Figure 3-3 shows the mean hypernasality rating averaged across listeners for each stimulus. The overall distribution of the ratings is shown by the error bars, which reflect large variability for the stimuli presented.

Significant main effects were found for vowel type [F(5, 123) = 24.129, p < 0.001]. The perceived nasality on / 2 ). Significant results were also observed for peak location [F(2, 123) = 5.739, p = 0.004]. Pairwise comparisons show that P1 and P3 contributed to higher perceived nasality across all vowels. On the other hand, the effect of different amplitudes of the extra peak
on the nasality judgment was not significant across vowels. No interaction was found between peak location and amplitude variation.

Discussion

The results of the study suggest that the location of the extra pole can influence the perception of nasality on vowels. Extra peaks at frequencies around 950 Hz and 1350 Hz for high vowels, and around 215 Hz and 615 Hz for low vowels, had the most influential effect on hypernasality ratings. On the other hand, the amplitude of the extra peak had no significant effect on the ratings. There was no effect for the interaction of peak location and amplitude either.

The peak amplitude was increased by 13, 14, and 16 dB. Other amplitude variations should be investigated in terms of their effect on hypernasality perception. More specifically, the range of intensities of the extra peak should be expanded to include lower and higher values. Moreover, another approach to modifying the extra peak amplitude is to adjust the difference between A1 and Pn in predetermined increments, rather than only changing the amplitude of the peak without taking the amplitude of the first formant into consideration. Chen (1995) reported that for a high judgment of perceptual hypernasality, a difference of no more than 8 to 10 dB is needed between A1 and Pn.

Figure 3-3 shows the average ratings of the five listeners for each vowel stimulus. It also indicates that the standard error of the mean (SE) was large for most of the stimuli. This could be explained by the small sample size (five listeners). The SE is a measure of the precision of the sample mean. It is calculated by dividing the standard deviation by the square root of the sample size, so as the sample size decreases the SE is expected to increase.
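The dependence of the SE on the number of listeners can be checked with a short Python sketch. The ratings below are hypothetical values on the 7-point scale, chosen only to show the arithmetic; they are not data from this study.

```python
import math
import statistics

def standard_error(values):
    """Sample standard deviation divided by the square root of n."""
    return statistics.stdev(values) / math.sqrt(len(values))

# Hypothetical ratings of one stimulus by five listeners.
five_listeners = [2, 3, 5, 4, 6]

# The same spread of ratings repeated for twenty listeners.
twenty_listeners = five_listeners * 4

print(round(standard_error(five_listeners), 3))
print(round(standard_error(twenty_listeners), 3))
```

With a similar spread of ratings, the larger group gives a noticeably smaller SE, which is the pattern the paragraph above describes.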
Results show that low vowels were perceived as more hypernasal than high vowels in this experiment. This result contradicts most of the work reported in the literature. In general, high vowels are rated more hypernasal than low vowels (Watterson, Lewis, Allord, Sulprizio, & O'Neill, 2007). Previous studies have shown that the acoustic consequences of coupling the oral and nasal cavities differ for different vowel types. The spectrum of the low vowel / / is less influenced by the coupling than that of the vowel /i/ (Fant, 1960; House & Stevens, 1956). This suggests that low vowels need a larger coupling than high vowels for a given degree of nasality (House & Stevens, 1956; Maeda, 1993; Chen, 1995). Accordingly, high vowels would be rated more hypernasal than low vowels under the same conditions. Findings similar to those of the present study have nevertheless been reported: low vowels were perceived as nasal more often than high vowels in natural speech (Ali et al., 1971; Lintz and Sherman, 1961).

The different acoustic effects of oral-nasal coupling could influence listener reliability (Watterson, Lewis, Allord, Sulprizio, & O'Neill, 2007). Another important factor in judging hypernasality reliably is the clinical experience of the rater (Lewis, Watterson & Houghton, 2003; Bradford, Brooks, & Shelton, 1964; Fletcher, 1976; Dalston, Warren, & Dalston, 1991). Although all the listeners in the study were students of speech and hearing sciences, none of them had clinical experience with patients with resonance disorders. This could have affected the inter- and intra-judge reliability of their hypernasality ratings. However, it is important to note that in spite of the low magnitude of the inter-rater reliability in the study, the correlation between the ratings of all listeners was significant. This means that they ranked the stimuli in a similar order regardless of the rating value they assigned to each stimulus.
The outcomes of the study may have been limited by several factors. Using isolated vowels as the speech stimuli might have affected the perceptual ratings of hypernasality. Although hypernasality is heard on the vowel segment, the use of CVC words or longer stimuli might have improved the judgments of hypernasality. Nasality ratings may vary depending on the nature of the context in which the vowel occurs. For example, differences in nasality judgments for different types of stimuli (a connected speech passage, isolated vowels, CVC syllables produced in isolation, and CVC syllables produced in connected speech) have been reported previously (Carney, & Sherman, 1971). Watterson et al. (1999) showed that the magnitude of the correlation between nasalance scores and the length of the stimuli was reduced with decreased length. The length of the stimuli is another variable that has been shown to influence listener reliability. Listener reliability for rating hypernasal speech is higher for sentences than for single words, and higher for single words than for isolated vowels (Counihan & Cullinan, 1970).

The perceptual judgments shown in Figures 3-1 and 3-2 suggest that a linear relationship of perceptual judgments of hypernasality is not present. The use of an equal-interval scale might have limited the results of the study, since it assumes a linear partition of the continuum of hypernasality. Equal-interval scales have been widely employed in studies investigating perceptual judgment of resonance disorders (Kuehn & Moller, 2000). This type of scaling task employs five, seven, or nine points, where the highest point indicates the most severe hypernasality (Kuehn & Moller, 2000; McWilliams et al., 1990; Whitehill, Lee, & Chun, 2002), and the listener is required to assign a point on the scale to each stimulus.
Psychophysicists suggest two types of perceptual continua, prothetic and metathetic (Stevens, 1974). Prothetic continua are related to intensity or quantity. This type of continuum is related to an additive physiological process, where new excitation is added to already existing excitation (Stevens, 1974). In contrast, metathetic continua are related to position or quality. This type of continuum is related to a substitutive physiological process, where new excitation is exchanged for excitation that has been eliminated (Stevens, 1974). continuum into equal segments. So, in light of the findings of the current study, an alternative rating scale, direct magnitude estimation (DME), which is a ratio scale, should be used to obtain perceptual judgments of hypernasality.

Speech synthesis has proved successful in discovering acoustic cues for speech perception. The advantage of synthetic speech is that it enables the control and manipulation of the acoustic variables related to a specific perceptual feature without confounding effects from other co-occurring problems often found in disordered speech. In this study, synthesis of vowels provided valuable information about the effect of the location and amplitude of the extra peak on perceptual judgments of hypernasality. Synthesis might yield better results for this purpose if whole sentences could be synthesized. Researchers have investigated the acoustic features of hypernasality by developing algorithms to analyze this resonant feature in the speech of the hearing impaired (Chen, 1995), in synthesized hypernasal speech (Vijayakshmi, & Reddy, 2004), and in nasalized vowels resulting from coarticulation (Hawkins, & Stevens, 1985). This study used synthesized vowels to investigate
perceptual ratings of hypernasality. It is important to keep in mind that once the acoustic cues related to hypernasality have been identified through a speech synthesis approach, the value of such acoustic cues in evaluating hypernasality in natural speech must be explored.

Figure 3-1. Hypernasality judgments for natural and synthetic vowels.
Figure 3-2. Distribution of vowels across hypernasality judgments.
Figure 3-3. A continuum of the ratings of hypernasality. Symbols indicate the perceived hypernasality severity averaged across listeners and bars indicate standard error of the mean (SEM).
CHAPTER 4
ACOUSTIC ANALYSIS OF HYPERNASALITY USING NATURAL VOWELS

Perceptual Analysis of Hypernasality

The distinctiveness of nasalized vowels comes from the fact that they are the only vowels in which air flows through two channels, radiating from both the nose and the mouth. This is a consequence of the coupling of the oral and nasal cavities, which in turn introduces additional pole-zero pairs in the transfer function. The major acoustic effects of this phenomenon include a reduction of the amplitude of the first formant (A1) and the appearance of nasal poles around 250 Hz and 1000 Hz (Hawkins, & Stevens, 1985). A nasal pole manifests as an extra peak in the spectrum. The amplitude of this extra peak is denoted Pn. The difference between these two values, A1-Pn, has been proposed as an acoustic correlate of vowel nasalization (Chen, 1995).

Quantifying the degree of hypernasality in vowels is a difficult task, mainly because it depends on several factors such as individual differences in anatomical structure, the size of the coupling between the two cavities, vowel identity, and phonetic context (Pruthi, 2007). For example, within the same subject, the interaction between the pole-zero pairs is complicated by the shape of the oral cavity, by changes in the area of the coupling, and by the different vowels produced during speech (Pruthi, 2007).

Purpose of the Study

The aim of this study was to determine whether the proposed measure A1-Pn could explain the variability of hypernasality in the cleft palate population. This study investigated the effect of this acoustic cue on the perception of hypernasality in natural speech samples. The long-term goal of this research is to
develop a model to predict the degree of hypernasality in voice. This will help improve the tools used for the assessment and rehabilitation of patients with velopharyngeal inadequacy and patients with cleft palate.

Methodology

Listeners

Ten normal-hearing native speakers of Arabic were recruited for this study. They were between 18 and 35 years of age on the day of testing. Listeners were undergraduate students of Speech-Language Pathology at the University of Jordan and were recruited by referral from two assistant professors in the department.

Stimuli

Sentences were recorded during two Operation Smile medical missions in Amman, Jordan. Obtaining the data through the missions was approved by University of Florida IRB01 #201200296. The recordings were made using a TASCAM DR-40 professional recorder with a head-worn condenser microphone (Audio-Technica; ATM73a). All recordings were sampled at 44.1 kHz with 16-bit quantization. The microphone-to-mouth distance was approximately 5 cm at an angle of about 45 degrees. Subjects who could read fluently were asked to read the stimuli; subjects who could not read were instructed to recite the sentences after the examiner.

Thirty-eight sentences were collected from each of the patients who came to the screening location of the mission. These sentences were composed to fall into five categories following the sonority hierarchy (Selkirk, 1984). Nine sentences were composed using mostly stop sounds, ten sentences were composed using mostly fricatives and affricates, two sentences were composed using glides and liquids, eight sentences were composed mostly using nasal consonants, and nine sentences were
composed using a mix of both oral and nasal sounds (Appendix). The sentences in all five categories mainly contained one of the four long vowels (/a:/, / :/, /i:/, and /u:/). Each sentence contained two or three instances of the same vowel. The analysis explained in later sections was performed on the long vowels in the sentence stimuli. The total number of tokens of each vowel produced by each speaker was 32 of /a:/, 9 of / :/, 21 of /i:/, and 29 of /u:/. The acoustic measure (A1-Pn) was averaged over the vowels in each sentence for use in the later statistical analysis. All the sentences were statements composed in Arabic.

Sentences recorded from twelve speakers were included in the perceptual study. A total of 120 sentences were selected. For each speaker, two sentences were chosen from each category. One of these two sentences was constant across all speakers (sentences 2, 10, 20, 24, and 38). The second sentence was randomly selected. The one exception to this selection criterion was the category of sentences composed of glides and liquids, since this category had only two sentences. As explained in the next section of the document, an anchor was used in the direct magnitude estimation rating task of the fourth training and the perceptual experiment. This anchor sentence was selected from a speaker whose sentences were included in neither the training nor the testing stimuli of the experiment.

Magnitude estimation tasks require the listener to assign a number to a sensory experience, and allow the use of any positive number, including decimals or fractions, to report sensation magnitude (Goldstein, 2010). In some cases the observer is given a number (modulus) that is assigned to a specific stimulus (anchor) and is asked to judge the stimuli presented relative to this anchor value. In this study the modulus value of 30 was
assigned to the anchor sentence. The anchor is usually chosen towards the lower or higher end of the scale. In some cases, the anchor stimulus is selected to represent the middle of the perceptual scale, but that is not a requirement (Stevens, 1974). There are no stringent rules for choosing the magnitude of the value assigned to the anchor (Goldstein, 2010), since it is used to calibrate the listeners to the scale (Zraick & Liss, 2000). In this study, the value of the anchor was chosen as follows. Three speech pathologists experienced in resonance disorders ranked all the stimuli from all speakers; the anchor sentence was chosen from the lower end of the range and was assigned the value of 30.

Another set of sentences was selected for the training sessions. These sentences were selected from a set of speakers whose recordings were not included in the testing session stimuli. A total of 72 sentences were used: 30 sentences for the first training, 20 sentences for the second training, 12 sentences for the third training, and 10 sentences for the fourth training.

All the speakers recorded for this study were screened for speech problems co-occurring with hypernasality. Compensatory misarticulations were present in the speech of 10% of the speakers, while 23% exhibited weak pressure consonant production. Moreover, 26% of the speakers had a nasal air emission. The type of the nasal air emission (audible or non-audible) was not documented. As for voice disorders, only one speaker had a strained voice, and the stimuli collected from this speaker were not included in the training or the testing stimuli.
Perceptual Experiment

Training sessions

Hypernasality is difficult to capture perceptually. In order to increase the reliability of the perceptual judgments, training was conducted for each listener. This training included both a background on the causes of hypernasality and practice in making perceptual judgments. The training was done in two steps, and the interval between any two steps did not exceed three to five days.

The first training step was an introduction session in which all participants attended a presentation on hypernasality. This presentation explained the anatomical structure of the vocal tract and velopharyngeal port function and disorders, and gave exemplars of resonance, articulation, and voice disorders. The presentation was followed by a question-and-answer session to clarify any remaining inquiries about the subject.

After the introduction session, the second step was administered to train listeners in making perceptual judgments of hypernasality. To enhance accuracy, this training was presented in four parts arranged in a hierarchy from simpler to more difficult tasks. Each part, and the instructions provided to the listeners to complete it, is described below.

In part 1, listeners were asked to judge the presence or absence of hypernasality in a two-alternative forced-choice task (Figure 4-1). Each of the stimuli selected for this task was presented 5 times in random order. The session lasted about 20 minutes [...] attention on detecting the presence or absence of a specific disorder, hypernasality.

In part 2, the participants were asked to identify which problems were present in each of the choices
(Figure 4-2). Because hypernasality is seldom present as an isolated problem in the speech of this population, this part of the training session was essential to attempt to [...] orders. The session lasted 30 minutes. Following the session, a discussion of the speech samples in question took place to provide the participant with feedback on which disorder(s) were present in the speech samples.

Part 3 used a paired comparison task and aimed to establish the ability to discriminate degrees of perceived hypernasality. Listeners were presented with two sentences and were asked to select the one that sounded more hypernasal (Figure 4-3). The testing lasted 40 minutes with three breaks, each offered after 10 minutes of testing. Following the session, feedback on which sentence was more hypernasal was provided to the listener and discussed if further clarification was needed.

Part 4 aimed at introducing the participant to the Direct Magnitude Estimation (DME) rating task with an anchor for judging hypernasality. One anchor sample was selected from the pool of sentences to represent the lower end of hypernasality and was assigned a value of 30 in order to calibrate listeners to the scale. For each sentence presented for rating, the participant was asked to assign a number between 0 and 100, where the lower end of the scale represented less hypernasality and the higher end of the scale represented severe hypernasality (Figure 4-4). All stimuli in this part of the training were judged as more hypernasal or less hypernasal than the anchor stimulus. Feedback on whether the listener had rated the target samples above 30 (i.e., the sample sounded more hypernasal than the anchor) or below 30 (i.e.,
the sample sounded less hypernasal than the anchor), in comparison to the prejudgments, was provided after the training.

All the feedback provided to the listeners in all parts of the training was based on the judgments of three experienced speech-language pathologists from Jordan. [...] Healthcare of Atlanta Hospital in Atlanta, Georgia. Moreover, they are all accredited as speech-language pathologists by Operation Smile headquarters in Norfolk, Virginia, to work with individuals with cleft lip and palate. All of them had at least 10 years of experience in this field.

All training and testing sessions were performed at the University of Jordan. Listeners were seated comfortably in a sound-treated room and wore a pair of headphones. All stimuli were presented at a comfortable loudness level. Prior to testing, listeners were briefed on the nature of the experiment as well as the expectations from participation. Once any questions or concerns were addressed, both the participant and the researcher signed the informed consent form.

Although each training session targeted a specific task, instructions across training sessions were similar. The listener was asked [...] visible on the GUI. Listeners were instructed not to respond until they had listened to the whole sentence. To ensure that no responses were made prior to listening to the complete sentence, the push buttons in training 1, 2, and 3 and the insert box in training 4 were not made available until after the sentence was presented [...] disappeared. The requirement for each step was illustrated on the GUI. Figures 4-1 to 4-4 [...] panel. For
[...] sentence was being played. The insert box was not yet activated. Following [...] to the [...] again while the stimulus was played. Similar to the last step, the insert box was still not active. After the presentation of the stimulus, a new phrase was shown below, indicating that the insert box was active and that the listener could respond. For training 1, 2, and 3, the listener was asked to click on the push button to respond. For training 4, on the other hand, the listener was asked to enter a number in the box and [...] to the next stimulus presentation.

All the training and testing sessions were conducted using a custom-designed graphical user interface (GUI) developed in MATLAB version 2012b. The equipment used for the training was a Sony VAIO E series laptop and Bose AE2 audio headphones (www.bose.com). The MATLAB GUI allows for better administration of the intended task for both the listener and the researcher. For example, for each of the four steps of training, the presentation of the stimulus [...] response were conducted automatically. Moreover, the listener was only required to click a button or enter a number to complete the task. The responses from each training session were automatically transferred to and stored in an Excel file, where each stimulus [...]. One Excel file was generated for each listener for each of the training sessions completed.

Testing: Participants who passed a threshold of 90% correct responses on the four training sessions were chosen to complete the perceptual experiment. Within one week of completion of the training sessions, the testing session
procedure was identical to the fourth part of the training session (hypernasality judgments using an anchored DME task). Each listener was asked to rate the hypernasality of the stimuli using DME with an anchor. Each stimulus was presented 5 times in random order. The anchor was presented at the beginning of each testing session and after every 10 sentence ratings. The participants were tested for an hour each day, with a 2-minute break every 10 minutes of testing in order to maintain an optimal level of attention and to minimize fatigue. The total testing time was between three and a half and four hours, completed over 4 days. This listening experiment was approved by University of Florida IRB02 #2012-U-0833.

Acoustic Analysis

The acoustic analysis was performed using algorithms implemented in MATLAB version 2012b. Stimuli were first edited using a tool called Label Trainer, developed by Mark Skowronski, Ph.D. This tool was used to extract vowels from the sentences manually. A screenshot of this tool is shown in Figure 4-5. The tool gives the option of viewing the audio signal in the time domain as well as its spectrogram. It also allows listening to selected portions of the signal to confirm the location of the vowel. Segmentation of mid-vowel sections was done manually by examining the spectrogram and the waveform as well as by listening to the selected segments.

For each of the extracted vowels, MATLAB version 2012b was used again to obtain the values needed to calculate the measure under investigation (A1-Pn). A Fast Fourier Transform (FFT) analysis was performed and compared to the Linear Predictive Coding (LPC) envelope. Both the FFT and the LPC envelope were plotted on the same figure to allow easier selection of the desired values. The plot had frequency in Hz on the x-axis and amplitude in dB on the y-axis (Figure 4-6). The MATLAB GUI allows the selection of
several vector points, storing their (x, y) values in a matrix to be used in later analysis. The values obtained included the first formant frequency (F1), amplitude (A1), and bandwidth (B1), and the second formant frequency (F2), amplitude (A2), and bandwidth (B2). Additionally, by following the spectral change in time, it was possible to locate the frequency and amplitude of the extra resonance peak (Fn). It is important to mention that the frequencies of the first and second formants were measured at the highest harmonic in that vicinity. Similarly, the frequency of the extra peak was measured at the highest harmonic within the frequency range where the extra peak is expected for each vowel, as reported by Chen (1995). Figure 4-7 shows an example of the harmonics selected for each of F1, F2, and Fn.

These values were then entered into the correction formulas (Equations 4-1 and 4-2) to obtain normalized values of the extra peak. The correction accounts for the effect that the first and second formants have on the amplitude of the extra peak when an extra pole is added, by recalculating the pole frequency with the formant frequencies of each vowel type taken into consideration. The measure in question, A1-Pn, was calculated using these values. The effect of the first formant on the extra peak due to nasal coupling (T1) was calculated using Equation 4-1, where F1 is the frequency of the first formant in Hz, B1 is the bandwidth of the first formant in Hz, and Fn is the frequency of the extra peak in Hz. Similarly, the effect of the second formant on the extra peak (T2) was calculated using Equation 4-2, where F2 is the frequency of the second formant in Hz and B2 is the bandwidth of the second formant in Hz. The values of T1 and T2 were
then converted to dB. By replacing Pn with (Pn - T1dB - T2dB), the parameter A1-Pn was normalized according to the various locations of the formants.

(4-1)

(4-2)

Challenges in obtaining A1-Pn: As explained earlier, hypernasality introduces extra nasal peaks to the spectrum as a result of one or more added pole-zero pairs. Since the nasal pole can affect more than one harmonic (peak) at the same time, oral-nasal coupling can enhance more than one harmonic in different locations (Figure 4-8). In some cases this presented a challenge in obtaining the values needed to calculate A1-Pn. This issue was present in seven instances of the vowel /i:/ and three instances of the vowel /a:/. For these vowels, the magnitude of each peak was estimated using the amplitude of the highest harmonic in the vicinity of the peak.

A similar challenge is that the extra nasal peak can occur at, or affect, a harmonic under the skirts of the first or second formant. This results in a merger of the formant and the extra peak, which was observed as a widening of the F1 or F2 peak. This merger in turn makes it harder to pick a harmonic for both the nasal peak and the formant, since there is no prominent extra peak in this case. Figures 4-9 and 4-10 are examples of this phenomenon. This case
was apparent in one instance of the vowel /i:/, one instance of the vowel / [...] instances of the vowel /u:/. For these cases, the second harmonic after the first harmonic peak was chosen as the location of the extra peak.

A similar problem arises when the extra peak is not obvious because of mild hypernasality (Figure 4-11). This was the case for five instances of the vowel /i:/ and three instances of the vowel / /. Selecting the harmonic closest to the extra nasal peak therefore presented an additional challenge in obtaining the values needed to calculate A1-Pn. For the purposes of this study, although the exact location of the extra nasal peak was sometimes not prominent and sometimes not constant, the harmonic was selected according to the frequency location of the nasal peak reported for different vowels in the literature (Table 4-1).

One observation made during the acoustic analysis of the stimuli is that the effect of the poles on the formants varies across speakers and contexts. In some cases the extra peak pulls the formant down, and in other cases it pushes it to a higher frequency. Figure 4-12 shows an example of a downward shift of the second formant for the vowel /i/.

Statistical Analyses

All statistical analyses were conducted using SPSS version 21 (IBM). Inter-judge reliability was defined as the degree of consistency between listeners in their average ratings of hypernasality severity for all the stimuli. Intra-judge reliability was defined as the degree of consistency within listeners across the five trials of hypernasality severity ratings for all the stimuli.

A backward stepwise regression was conducted to test the effect of several variables on the perception of hypernasality. This analysis was chosen because it starts
with all candidate variables and tests the effect of deleting each variable using a chosen model comparison criterion (Iversen & Gergen, 1997). This helps in testing which variables contribute to the overall usefulness of the model. The process of deleting variables is repeated until no further improvement is possible.

Categorical independent variables can be entered directly as predictors in regression models in the form of dummy variables (Allen, 1997). The only consideration is that, when they are entered as predictors, the interpretation of the regression weights depends upon how the variable is coded. Therefore, for the purposes of this study, categorical independent variables were acceptable for use in the regression analysis.

In this study the backward stepwise regression model was constructed in two blocks. All categorical independent variables (speaker group, vowel type, sentence type) were entered in the first block, to ensure that all of these variables were taken into account before the effect of the main predictor (A1-Pn) on perceptual ratings of hypernasality was tested. The second block included the acoustic measure as a predictor (independent variable). The two-block design makes it possible to test the effect of the acoustic measure A1-Pn on the dependent variable (perceptual ratings of hypernasality) after controlling for the other variables.

The categorical independent variables were coded by assigning a number to each class of each variable. For speaker group, a child speaker was coded as 1, an adult female was coded as 2, and an adult male was coded as 3.
Similarly, for vowel type, low vowels were coded as 1 and high vowels were coded as 2. Finally, sentence types were coded as 1, 2, 3, 4, and 5 to represent sentences composed mostly of stops, fricatives and affricates, glides and liquids, nasals, and a mix of oral and nasal sounds, respectively.

To further investigate the effect of different vowels on the perception of hypernasality, a simple linear regression was performed for each vowel, with the A1-Pn values as the independent variable and the perceptual ratings of hypernasality as the dependent variable. The results of this analysis are discussed in the next section.

Results

The average A1-Pn values for each vowel for each speaker group are shown in Table 4-2, along with the average A1-Pn value across all speaker groups for each vowel. Further statistical analysis of A1-Pn for each speaker group was not conducted because the speakers of this study included only one adult male subject.

Statistical Findings

Intra-judge and inter-judge reliability measures were analyzed for each listener, for each sentence. The intra-judge reliability across the five trials was 0.78 (standard deviation: 0.10, range: 0.36-0.86). The five trials from each listener were averaged for each stimulus, which reduced the data to one average response from each listener for each of the stimuli. A matrix of concordant [...] the 10 raters produced significant inter-judge correlations (p < 0.01), with the exception of listeners 6 and 9 (p > 0.01).

The first and last trials were averaged across listeners for each stimulus in order to analyze the effect of familiarity with the stimulus. The average absolute difference between the mean response across listeners for each stimulus from Trial 1 and Trial 5
was 4.72, with a standard deviation of 3.05, ranging from a minimum absolute difference of zero to a maximum absolute difference of 14.4 (Figure 4-13). Average [...] trials 1 and 5 was r = 0.89 (p < 0.01).

Figure 4-14 details how each listener used the scale. The box-and-whiskers plot shows the 25th and 75th percentiles of the DME ratings for each listener, represented by the lower and upper edges of each box, respectively. The whiskers show the largest and smallest data points within the critical interval. The critical interval length is defined as 1.5 times the distance between the 25th and 75th percentiles, extending upwards from the 75th percentile line and downwards from the 25th percentile line. The red line in the middle of the box represents the median, and the plus signs represent the outliers. None of the listeners used the entire 0-100 range. Listener 5 mostly used the mid-range of the rating scale, with a mean rating of 43.1. Most listeners used a wider range of the scale than listener 5. Nonetheless, the majority sustained a mid-range mean.

All stimuli were coded by sentence type (stops, fricatives and affricates, glides and liquids, nasals, and a mix of nasal and oral consonants), vowel type (low and high), and speaker group (child, adult female, and adult male). Those variables, together with the acoustic measure A1-Pn, were used in a backward stepwise multiple regression analysis to predict perceptual judgments. The correlations of the variables are shown in Table 4-3. As can be seen, all correlations except for the one between DME raw ratings and sentence type were statistically significant.
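The coding of the categorical predictors can be illustrated with a short sketch. The integer codes below are the ones reported in this chapter; the Python dictionary keys, the function names, and the dummy-variable expansion with a reference level are illustrative assumptions and are not taken from the SPSS analysis itself.

```python
# Integer codes as reported in the text. The string keys are hypothetical labels.
SPEAKER_GROUP = {"child": 1, "adult_female": 2, "adult_male": 3}
VOWEL_TYPE = {"low": 1, "high": 2}
SENTENCE_TYPE = {
    "stops": 1,
    "fricatives_affricates": 2,
    "glides_liquids": 3,
    "nasals": 4,
    "mixed_oral_nasal": 5,
}


def integer_code(group, vowel, sentence):
    """Return the (group, vowel type, sentence type) codes for one stimulus."""
    return (SPEAKER_GROUP[group], VOWEL_TYPE[vowel], SENTENCE_TYPE[sentence])


def dummy_code(level, levels, reference):
    """Assumed alternative coding: one 0/1 indicator per non-reference level."""
    return [1 if level == other else 0 for other in levels if other != reference]


if __name__ == "__main__":
    # A child speaker, high vowel, nasal sentence.
    print(integer_code("child", "high", "nasals"))
    # Speaker group as two indicators, with "child" as the reference level.
    print(dummy_code("adult_female", list(SPEAKER_GROUP), "child"))
```

With integer codes, a single regression weight describes a linear trend across the code values; with dummy indicators, each non-reference level receives its own weight. This is the sense in which the interpretation of the regression weights depends on how a categorical variable is coded.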
The prediction model contained three of the four predictors and was reached in two steps, with only one variable removed (sentence type). The model was statistically significant, F(3, 109) = 74.422, p < .001, and accounted for approximately 67% of the variance of DME ratings (R² = 0.672, adjusted R² = 0.663). Hypernasality ratings were primarily predicted by the acoustic measure A1-Pn, and to a lesser extent by speaker group and vowel type. The raw and standardized regression coefficients of the predictors, together with their correlations with hypernasality ratings and their squared semi-partial correlations, are shown in Table 4-4. A1-Pn received the strongest weight in the model, followed by speaker group and vowel type with lower weights, respectively. The regression equation based on the coefficients of the regression for the entire data set is:

y = 50.7 + 1.936 (A1-Pn) + 2.366 Group + 5.472 Vowel type (4-3)

The linear regression analyses performed for each vowel revealed significant negative correlations between the A1-Pn values and the perceptual ratings of hypernasality for all vowels except the vowel /u/. The correlations of the A1-Pn value with perceptual ratings for each vowel are reported in Table 4-5. The regression models for all of the vowels except /u/ significantly predict the perceptual ratings of hypernasality. Table 4-5 shows the significance level of each regression model, along with the model equation, for each vowel.

Discussion

The aim of this study was to determine whether the proposed measure A1-Pn could explain the variability of hypernasality in the cleft palate population. Results indicate that A1-Pn has a high negative correlation with perceptual judgments of hypernasality. This suggests the usefulness of A1-Pn in explaining hypernasality perceptually.
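As a consistency check on the model summary reported in the Results, the adjusted R² and the F statistic can be recomputed from R² and the degrees of freedom using the standard formulas. The sketch below is written in Python rather than SPSS; the function names are hypothetical, and the sample size n = 113 is inferred from F(3, 109) under the assumption that the model includes an intercept.

```python
def adjusted_r2(r2, n, p):
    """Adjusted R-squared for a model with p predictors plus an intercept."""
    return 1.0 - (1.0 - r2) * (n - 1) / (n - p - 1)


def f_statistic(r2, n, p):
    """Overall F statistic of the regression, with (p, n - p - 1) degrees of freedom."""
    return (r2 / p) / ((1.0 - r2) / (n - p - 1))


if __name__ == "__main__":
    p = 3                  # predictors retained: A1-Pn, speaker group, vowel type
    df_residual = 109      # denominator degrees of freedom from F(3, 109)
    n = df_residual + p + 1  # inferred number of cases (assumes an intercept)
    r2 = 0.672             # reported R-squared, rounded to three decimals

    print("n =", n)
    print("adjusted R^2 =", round(adjusted_r2(r2, n, p), 3))
    print("F =", round(f_statistic(r2, n, p), 2))
```

With R² = 0.672, the recomputed adjusted R² rounds to 0.663, matching the reported value, and the recomputed F is about 74.4. The small difference from the reported 74.422 is expected because R² is given only to three decimals.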
The results of the current study indicated that all vowels had significant correlations between A1-Pn and perceptual judgments except for the vowel /u/. This result is consistent with the findings of Maeda (1982a), who investigated vowel nasalization by simulating the vocal tract. He synthesized different vowels with different degrees of nasal coupling to examine the perceived strength of nasalization, and noted that the agreement between perceived nasalization and vowel spectra is more complicated for /u:/ than for /i:/ and / /. Although Maeda was looking at the distance between the first formant frequency and the frequency of the nasal peak, as opposed to the difference in amplitude of the two peaks, his results agree with the findings of the current study. There was a better correlation between the acoustic measure under investigation and perceptual judgments of nasalization for all the vowels he examined except the vowel /u:/. He reported that for the vowel /u/, perceptual judgments correlate with the acoustic measure in the case of a large oral-nasal coupling. This suggests that oral-nasal coupling has a different effect on different vowels. Although /i:/ and /u:/ are both high vowels, the backness of the vowels needs to be examined more closely in relation to the acoustic consequences of nasalization.

One observation made while analyzing the stimuli of the current research is that the process of obtaining the values needed to calculate the acoustic measure A1-Pn was more challenging for the vowel /u:/. The vowel /u:/ has a relatively low F1 and F2, and its backness makes the distance between F1 and F2 smaller than for the other high vowel, /i:/. Since the nasalization consequence on the
vowel spectrum introduces an extra peak between F1 and F2, a merger of the formant and the peak was noticed for the vowel /u:/. As discussed earlier, this presented a challenge in locating the extra peak. This may have contributed to the low correlation between A1-Pn and the perceptual judgments of hypernasality for /u:/.

The correlation between A1-Pn and perceptual judgments of hypernasality was significant for /i:/, /a:/, and / /. However, the correlation for the high vowel /i:/ was higher than for the two low vowels, /a:/ and / /. This result is in line with the findings of Chen (1995). She reported that the significant correlation between the acoustic measure and perceptual judgments was lower for these two vowels than for the other vowels she investigated in her study. This could be explained by the fact that low vowels need a bigger coupling than high vowels for a given nasality. Low vowels are perceived as nasal more often than high vowels (Ali, Gallagher, Goldstein, & Daniloff, 1971; Lintz & Sherman, 1961). This has been observed in studies that utilized natural speech. Moreover, studies utilizing synthetic stimuli suggest that a larger coupling of the oral and nasal cavities is needed for low vowels to be perceived as nasal, as compared to high vowels (Abramson, Nye, Henderson, & Marshall, 1981; House & Stevens, 1956; Maeda, 1982c). This could be due to the more closed oral cavity posture of high vowels as compared to low vowels. Hence, a smaller coupling of the oral and nasal cavities is enough to make the vowel sound nasalized in the case of high vowels.

The current study shows promising results for the acoustic measure A1-Pn in predicting perceptual judgments of hypernasality. There was a high correlation for three of the four vowels examined in this study. Since the formant pattern is a main indicator for vowel perception, stimulating a harmonic around any of the first three formants will
have an influence on vowel identity. Although A1-Pn for the vowel /u:/ [...] correlation with perceptual ratings of hypernasality, normalizing the values obtained for the extra peak according to the formula presented by Chen (1997) (Equations 4-1 and 4-2) was useful in accounting for vowel type and for the effect of F1 and F2 on the peak, by recalculating the pole frequency with the formant frequencies of each vowel type taken into consideration.

For some vowels, obtaining the values needed to calculate A1-Pn was challenging. More specifically, in the case of mild hypernasality, it was difficult to locate the extra peak. In these cases, the spectral values obtained to calculate A1-Pn were based on the extra peak locations reported in studies conducted using English stimuli. The stimuli of the current study were composed in Standard Arabic. However, the formant frequency, amplitude, and bandwidth of Arabic vowels show a trend similar to that of the formants of the vowels shared with English (Alghamedi, 1998). Chen (1995) studied the acoustic consequences of nasalized vowels in English and French. More specifically, the investigator reported comparable values of the acoustic measure A1-P0/A1-P1 when comparing oral vowels to their nasal counterparts.

Language independence is a factor to be considered in this case. Identification of oral and nasal vowels is similar across languages; however, the discrimination of these vowels shows language-specific differences. There is an interaction between judgments and the phonological and phonetic nature of the stimuli presented to listeners of different native languages. These [...] language (Hoffman, Krakow, 1993). [...] English and Arabic vowels. The shared vowels between the two languages are all produced orally in the
context of oral consonants. In conclusion, since the formant pattern is similar for these vowels in English and Arabic, and since the same acoustic consequences of vowel nasalization are observed on vowels of different languages, it is reasonable to assume that the location of the extra peak reported in the literature on English can be used for similar studies in Arabic.

The exclusion of sentence type from the regression model, for not having a significant correlation with the dependent variable (perceptual ratings of hypernasality), is in agreement with previous research by Watterson and colleagues (1998). They showed that there was no significant difference in nasality ratings between low-pressure and high-pressure stimuli. However, their study did not explain whether the stimuli in both categories were produced by the same speaker or not. Other studies have suggested different conclusions in this regard. Perceived vowel nasality could be influenced by the phonetic context in which the vowels occur. Lintz and Sherman (1961) showed that nasality was perceived as less severe for syllables composed with stop sounds than for syllables composed with fricatives. Moreover, Kawasaki (1986) reported that perceived nasality increased when adjacent nasal consonants were attenuated.

The current study examined the acoustic consequences of vowel nasalization using natural sentence stimuli spoken by individuals with cleft palate. None of the previous research conducted in this area has utilized similar stimuli. In a study conducted by Chen (1997) to test the effectiveness of the acoustic measure A1-P0/A1-P1 in assessing vowel nasalization, she investigated vowels in English word stimuli, examining minimal pairs composed of vowels between the oral consonant /b/ and the same vowels between the nasal consonant /m/. The study also examined syllables of French
where the vowel is phonemically nasalized by comparing the more nasalized portions of [...] study shows an agreement with the results of the current experiment regarding the usefulness of A1-Pn in explaining the variability in hypernasality. As described earlier in this document and as noted by previous studies exploring this acoustic measure, the value of A1-Pn increases as the perceived hypernasality decreases.
Although both the [...] have reached similar results by exploring the same acoustic measure, it is important to note that the length of the stimuli affects the reliability of hypernasality judgments. For speech stimuli reflecting hypernasality, the reliability of ratings increases with longer stimuli. Reliable perceptual ratings of hypernasality can be obtained using stimuli as short as a six-syllable sentence. However, shorter stimuli such as isolated vowels or two-syllable words will not achieve high criterion validity and cannot be used to obtain valid estimates of hypernasality (Watterson, Lewis, & Foley-Homan, 1999). A previous study examining acoustic correlates of hypernasality showed that their acoustic measure, one-third octave analysis, [...] at isolated vowels (/i:/, /a:/) as opposed to examining the same vowels in CVC words (Kataoka, Michi, Okabe, Mirura, & Yoshida, 1996).
Since the reliability of the perceptual judgments of hypernasality increases with longer stimuli, [...] of this study has utilized sentence stimuli. However, challenges in obtaining this type of longer stimuli have arisen. One challenge faced while collecting sentences for this study was that younger subjects had difficulty producing the whole stimulus and sometimes were noncompliant. Similar incidents have
been reported by other researchers (Watterson, Lewis, & Foley-Homan, 1999). Another complication introduced by the longer stimuli is the additional speech and voice problems that were present in the speech of the speakers. When we assess vowels, the interaction of the acoustic signal and the vocal tract is somewhat minimal, as the amount of constriction created by the articulators is small. An unrepaired cleft palate hinders speech to be more difficult to judge only in terms of hypernasality. The compensatory misarticulations and the nasal air emission could influence the isolation [...] (Bzoch, 2004). Articulatory precision affects the perception of hypernasality, in particular for untrained listeners. Therefore, a [...] occurring speech problems with hypernasality was administered for each listener to comply with the reported findings in the literature that the reliability of perceptual judgments of hypernasality increases with listener experience (Lewis, Watterson, & Houghton, 2003).
In addition to the co-occurring speech problems in the speech of this population, another challenge arises from the difficulty in capturing hypernasality perceptually. To address this problem, a training protocol for the listeners to practice making perceptual judgments was administered. Compared with the results of the study in Chapter 3, the training protocol could have contributed to higher reliability in the perceptual ratings of hypernasality.
None of the previous studies conducted to investigate vowel nasalization acoustically examined vowel nasalization introduced by a functional or structural cause of hypernasality. In this study, sentences produced by individuals with cleft palate were utilized. Other studies utilized minimal pairs of English vowels in CVC words with their
nasal counterparts (Chen, 1995). Another study conducted by the same researcher examined vowel nasalization in phonemically nasalized vowels in French. Lee et al. (2006) tested the effectiveness of the VLHR (discussed in Chapter 3) in detecting hypernasality by studying the sustained vowel /a:/ and a simulated nasalized /a:/ produced by normal speakers. Although their study showed that the VLHR index increases as nasalization of the vowel increases, perceptual studies using this measure showed no significant difference between hypernasal and normal speech (Vogel, Ibrahim, Reilly, & Kilpatrick, 2009).
Pruthi (2007) suggested that the same acoustic parameters can be used to capture both phonemic and phonetic nasalization. One difference between phonemic and phonetic nasalization is that the duration of nasalization is much longer when vowels are phonemically nasalized than when they are phonetically nasalized. This is supported by the work of Delattre and Monnot (1968). Phonemically nasalized vowels are more nasalized than vowels occurring adjacent to nasal consonants. This is due to the longer lowering of the velum for phonemically nasalized vowels than for phonetically nasalized vowels (Pruthi, 2007). Although the acoustic effects of oral-nasal coupling are similar in both types, the amount of spectral change differs between the two (Chen, 1995). Even in the case of phonemic vowel nasalization, the onset of the vowel is less nasalized than the middle and end portions of the vowel (Chen, 1997). The acoustic characteristics of nasalization are more strongly expressed in terms of degree and duration for phonemically nasalized vowels (Pruthi, 2007). In the case of vowel nasalization stemming from functional or structural causes, the oral-nasal coupling is
almost always constant throughout the production of the whole speech sample, whereas in phonemic and phonetic vowel nasalization the velum is lowered for only a portion of the vowel. Although the view expressed by Dickson (1962) and Pruthi (2007) suggests that the acoustic cues are the same irrespective of the category of nasality, it is important to take into consideration that, when investigating vowel nasalization, the cause of nasalization may introduce some differences in the magnitude of the spectral change of the vowel.
Figure 4-14 details how each listener used the scale. None of the listeners used the entire 0-100 range of the scale. Listener 5 mostly used the mid-range of the rating scale, with a mean rating of 43.1 and a range of 30 to 52.8. This narrow range suggests that listener 5 rated the stimuli with a limited degree of hypernasality. Most listeners used a wider range of the scale than listener 5. Nonetheless, the majority sustained a mid-range mean. In spite of the moderate magnitude of the inter-rater reliability in the study, the correlations between the ratings of all listeners were significant except for listeners 6 and 9. This indicates that listeners ranked the stimuli similarly regardless of the value they assigned to each stimulus on the DME scale.
To take a closer look at how the listeners rated the stimuli across the five presentations of each sentence, trials 1 and 5 for each stimulus were averaged across listeners, and the correlation between the two trials was calculated. The high correlation coefficient (r = 0.89) indicates that there was no learning effect in rating the stimuli. The average absolute difference between the mean responses across listeners for each stimulus on trial 1 and trial 5 was as low as 4.72.
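The trial-to-trial consistency check described above can be illustrated with a short sketch. This is not the analysis script used in the study: the function names and the small rating vectors are hypothetical, and only the two statistics (the Pearson correlation between the per-stimulus mean ratings on trial 1 and trial 5, and the average absolute difference between them) are taken from the text.

```python
import math

def pearson_r(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

def mean_abs_difference(x, y):
    """Average absolute difference between paired values."""
    return sum(abs(a - b) for a, b in zip(x, y)) / len(x)

# Hypothetical per-stimulus mean ratings (0-100 scale), averaged across listeners.
trial_1 = [30.0, 42.5, 55.0, 61.0, 48.0]
trial_5 = [33.0, 40.0, 58.5, 57.0, 50.0]

r = pearson_r(trial_1, trial_5)
mad = mean_abs_difference(trial_1, trial_5)
print(f"r = {r:.2f}, mean absolute difference = {mad:.2f}")
```

A high correlation together with a small mean absolute difference would suggest, as argued above, that the ratings did not drift between the first and the last presentation.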
Studies conducted to quantify hypernasality acoustically have been limited. Both [...] correlations with perceptual judgments of vowel nasalization. Studies conducted to detect vowel nasalization using VLHR show that the VLHR index increases as nasalization of the vowel increases. However, perceptual studies using this measure showed no significant difference between hypernasal and normal speech (Vogel, Ibrahim, Reilly, & Kilpatrick, 2009). On the other hand, one-third octave analysis achieved a modest level of agreement with the perceptual assessments (Vogel, Ibrahim, Reilly, & Kilpatrick, 2009); nonetheless, A1-Pn achieved a higher correlation with perceptual judgments of hypernasality in the current study. One-third octave analysis provides comparisons of the spectral profiles of speech of individuals with hypernasality and normal controls. As opposed to A1-Pn [...] be correlated to perceptual judgments. The use of this method requires the comparison of two different speech samples by examining the bands in their spectra. Studies conducted using [...] hypernasality degrees when looking at isolated vowels (/i:/, /a:/) (Kataoka, Michi, Okabe, Mirura, & Yoshida, 1996). A1-Pn succeeded in explaining the variability of hypernasality across three of the four examined vowels in this study (/a:/, / :/, and /i:/).
The studies conducted to investigate the acoustic consequences of hypernasality mainly examined changes to the power spectrum at low frequencies. Similarly, the current research assessed the effect of hypernasality at the lower frequencies. A1-Pn, spectral flatness, and F1 prominence are in general vowel-independent acoustic cues of [...] explain 100% of the variance in the perceptual
judgments of hypernasality. A1-Pn accounted for 58% of the variance in perceptual judgments after controlling for the other variables. Moreover, A1-Pn [...] degrees of hypernasality for the high back vowel /u:/. The acoustic effects of the nasal coupling are vowel dependent with respect to the judgment of the presence and degree of nasalization. Observing changes at higher frequency ranges and including them in the assessment tool might lead to a better measure that is vowel independent and accounts for more of the variance in hypernasality. It might also aid in finding a factor of nasalization that is common to different vowels and that proves to be a better cue for vowel nasalization.
Figure 4-1. GUI for training 1.
Figure 4-2. GUI for training 2.
Figure 4-3. GUI for training 3.
Figure 4-4. GUI for training 4.
Figure 4-5. Label Trainer GUI.
Figure 4-6. FFT power spectrum and LPC envelope.
Figure 4-7. An example of the harmonics picked for F1, Fn, and F2.
Figure 4-8. An example of multiple harmonics affected by the extra pole-zero pair(s) (indicated by the two black arrows).
Figure 4-9. A spectral envelope showing a blend of F1 and Fn.
Figure 4-10. A spectral envelope showing a blend of F1, Fn, and F2.
Figure 4-11. An example of a spectral envelope where the extra peak could not be identified clearly.
Table 4-1. Average frequencies of the nasal peaks as reported by Chen (1997). FP0 and FP1 are the frequencies of the nasal peaks.
Vowel    FP0 (Hz)    FP1 (Hz)
/i/                  964
/u/      206         1032
/æ/      212         924
/ /      217         953
Figure 4-12. An example of formant location shift caused by the introduction of an extra pole-zero pair due to hypernasality.
Table 4-2. Average A1-Pn values in dB for each vowel for each speaker group.
Vowel    Child    Adult Female    Adult Male    Average across speaker groups
a:       4.34     4.92            7.38          5.55
:        3.62     3.70            3.04          3.45
i:       7.93     2.28            9.55          6.58
u:       8.11     6.32            13.31         7.22
Figure 4-13. Raw DME Intelligibility judgments of Trial 1 and Trial 5, ordered from least to greatest score on Trial 1 (r = 0.89, p < 0.01).
Figure 4-14. [...] and 75th percentile (upper edge of box) hypernasality ratings for each listener.
Table 4-3. Correlations of the Variables in the Regression Analysis.
Variable                 A1-Pn     Group     Vowel type    Sentence type
Hypernasality ratings    0.765*    0.266*    0.172**       0.058
A1-Pn                              -0.142    0.116         0.077
Group                                        -0.086        0.087
Vowel type                                                 -0.01
Sentence type                                              -
* p < .01. ** p < .05.
Table 4-4. Backward Stepwise Regression Results.
Model          b         SE b     Beta     Pearson r    Sr2
Constant       50.705    2.165
A1-Pn*         1.936     0.14     0.775    0.765        0.58
Group          2.366     0.982    0.134    0.266        0.02
Vowel type*    5.472     1.214    0.25     0.172        0.06
Note: The dependent variable was hypernasality ratings. R2 = 0.672, adjusted R2 = 0.663. Sr2 is the squared semipartial correlation. * p < .05.
Table 4-5. Linear Regression Results for Each Vowel.
Vowel    Correlation    Model significance    Model equation
/a:/     0.550**        0.000                 y=56.931 (1.291)A1-Pn
/ /      0.747**        0.002                 y=56.321 (2.159)A1-Pn
/i:/     0.890**        0.000                 y=67.435 (2.181)A1-Pn
/u:/     0.257          0.743                 y=66.36 (1.549)A1-Pn
** Correlation is significant at the 0.01 level (2-tailed).
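As a quick check on how the correlations in Tables 4-3 to 4-5 translate into proportions of variance, the sketch below squares each reported coefficient. The numbers are copied from the tables; the dictionary layout and the variable names are illustrative only, and the key "row 2" is a placeholder for the vowel whose symbol is missing in Table 4-5.

```python
# Correlations between A1-Pn and hypernasality ratings, as reported in Table 4-5.
# "row 2" is a placeholder label for the vowel whose symbol is missing in the table.
vowel_r = {"a:": 0.550, "row 2": 0.747, "i:": 0.890, "u:": 0.257}

for vowel, r in vowel_r.items():
    # r squared is the share of rating variance linearly associated with A1-Pn.
    print(f"/{vowel}/: r = {r:.3f}, r^2 = {r * r:.3f}")

# Zero-order correlation between A1-Pn and the ratings (Tables 4-3 and 4-4).
overall_r = 0.765
overall_r_squared = overall_r ** 2
print(f"overall: r = {overall_r}, r^2 = {overall_r_squared:.3f}")
```

Squaring the zero-order correlation of 0.765 gives about 0.585, which is close to the squared semipartial correlation of 0.58 reported in Table 4-4. The two quantities are not the same in general; they are close here because A1-Pn shares little variance with the other predictors.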
CHAPTER 5
CONCLUSIONS
Summary of Findings
The work described in this dissertation focused on assessing the validity of the acoustic measure A1-Pn in predicting perceptual judgments of hypernasality. The difference between the amplitude of the first formant and the amplitude of an extra peak (A1-Pn) is shown to be a promising measure of hypernasality. The acoustic measure A1-Pn correlated with perceptual judgments of hypernasality at r = 0.765 and accounted for 58% of the variance in those judgments. The significant correlation between A1-Pn and perceptual judgments was evident for three of the four vowels examined in this study (/a:/, / :/, and /i:/). Although a correction formula was used to account for the effect of the first and second formants on the extra peak, A1-Pn was not able to explain the variability in nasality judgments for the vowel /u:/.
The study in Chapter 3 suggests that the extra pole location can influence the perception of nasality [...] an effect on the perception of hypernasality. However, the range of the intensities of the extra peak should be expanded to include lower and higher variations. Since A1-Pn has shown effectiveness in detecting hypernasality, the magnitude of change in the difference between A1 and Pn should be examined more closely.
This research focused on changes induced by hypernasality on the vowel spectrum by examining the lower frequency ranges. Similar extra peaks have been reported to manifest at higher frequencies (Hoffman, Krakow, 1993). Those changes in the high-frequency regions should be explored.
General Limitations
Limitations of this research lie in the range of hypernasality reflected in the stimuli presented in both studies. A wider range of degrees of hypernasality is essential for investigating the acoustic properties of this resonant quality. Another limitation related to the stimuli is that, across the two missions targeted, only one adult male among all the speakers included in this study exhibited hypernasality. To better explore the effect of gender-related characteristics on the acoustic measure of hypernasality, a comprehensive sample of male and female recordings is warranted.
A distinctive feature of hypernasality is that the changes it induces in the power spectrum cannot always be detected. For example, an extra pole-zero pair can translate into an extra peak that can be observed on the spectral envelope. However, the locations of the zeros are hard to identify. It is usually not apparent on the spectrum where a zero has landed and what range of frequencies it has affected.
Since the stimuli of the study in Chapter 4 were collected by targeting medical missions that last for a week each, the recordings collected depended highly on the number and nature of patients showing up at each mission. A wider range of hypernasality and a more comprehensive sample set could be obtained by establishing a large database of recordings of cleft palate speech in general and hypernasal speech in particular.
Future Directions
The long-term goal of this research is to develop a model to predict the degree of hypernasality in speech. This will have a direct impact on creating better tools for assessment and rehabilitation of patients with velopharyngeal inadequacy.
Although using natural speech to detect hypernasality is important, a synthesized-stimulus approach helps in controlling the degree of change in the acoustic cue in question. With the intention of scaling hypernasality, synthesized speech samples might serve this purpose, provided that the appropriate length of stimuli is taken into consideration. For future research, synthesis of a broader range of stimuli might provide a closer approximation of nasal voices. Specifically, manipulating acoustic features such as fundamental frequency and intensity should be investigated further.
The investigation of A1-Pn to predict velopharyngeal coupling size would be useful in a clinical setting. Moreover, this measure may be useful in clinical applications that require quantification of hypernasality for patients with velopharyngeal insufficiency and cleft palate.
Another future task is to automate the measurement of this acoustic measure. This requires consistent detection of the values used to calculate it. A formant tracker is needed to detect the formants. The extra peak needs to be detected around 950 Hz for high vowels and around the highest harmonic in low frequencies for low vowels.
The investigation of more than one peak for the same vowel would improve the validity of this measure. The collective effect of all the extra peaks induced should be further investigated in an attempt to account for all the changes in the power spectrum of the nasalized vowel. Moreover, as shown by previous research, the peak becomes more prominent as the distance between the introduced pole and zero increases (Chen, 1995). This in turn might excite more than one harmonic. The change in all the excited harmonics due to the extra peak should be further investigated as well. One proposed way to account for this is to study the distribution of energy in frequency bands rather than looking at single harmonics.
Research in the field of hypernasality has evolved from qualitative to quantitative measures. Nonetheless, the search for a good acoustic correlate of hypernasality is still in its early stages. Previous work as well as this work investigated a single acoustic parameter of hypernasality. More analysis is needed to find a set of measures that can collectively predict perceptual judgments of hypernasality.
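The automation task outlined under Future Directions might, for a high vowel, look roughly like the sketch below. It is an illustration under stated assumptions rather than the procedure used in this work: the fundamental frequency and F1 are assumed to be supplied by a separate pitch and formant tracker, the 150 Hz half-width of the search window around 950 Hz is an arbitrary choice, the function names are hypothetical, and the test signal is a synthetic sum of sinusoids rather than speech. The low-vowel case is not covered.

```python
import math

def harmonic_level_db(signal, fs, freq):
    """Level in dB of the sinusoidal component at `freq` (single-frequency DFT)."""
    n = len(signal)
    re = sum(s * math.cos(2 * math.pi * freq * i / fs) for i, s in enumerate(signal))
    im = sum(s * math.sin(2 * math.pi * freq * i / fs) for i, s in enumerate(signal))
    amplitude = 2 * math.hypot(re, im) / n
    return 20 * math.log10(max(amplitude, 1e-12))

def a1_minus_pn(signal, fs, f0, f1, peak_center=950.0, half_width=150.0):
    """A1-Pn in dB: level of the harmonic nearest F1 minus the level of the
    strongest harmonic within peak_center +/- half_width (assumed window)."""
    harmonics = [k * f0 for k in range(1, int(fs / (2 * f0)) + 1) if k * f0 < fs / 2]
    f1_harmonic = min(harmonics, key=lambda h: abs(h - f1))
    a1 = harmonic_level_db(signal, fs, f1_harmonic)
    candidates = [h for h in harmonics
                  if abs(h - peak_center) <= half_width and h != f1_harmonic]
    if not candidates:
        raise ValueError("no harmonic falls inside the extra-peak search window")
    pn = max(harmonic_level_db(signal, fs, h) for h in candidates)
    return a1 - pn

# Synthetic test signal: harmonics of 200 Hz with known amplitudes (not speech).
fs, f0, n = 8000, 200.0, 800
amplitudes = {200.0: 0.5, 400.0: 1.0, 600.0: 0.3, 800.0: 0.05, 1000.0: 0.1}
signal = [sum(a * math.sin(2 * math.pi * f * i / fs) for f, a in amplitudes.items())
          for i in range(n)]

value = a1_minus_pn(signal, fs, f0, f1=420.0)
print(f"A1-Pn = {value:.1f} dB")
```

Taking the strongest harmonic inside a window, rather than a single fixed harmonic, is one simple way to tolerate small shifts in the location of the extra peak. Summing the energy of all harmonics in the window would move the sketch toward the band-energy approach suggested above.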
APPENDIX
LIST OF SENTENCES AND THEIR PHONETIC TRANSCRIPTION
Sentence No.    Sentence    Phonetic Transcription
Stops
1    / ba:da fat
2
3    /q d
4    tt a bta:kul/
5
6    i
7    /du:du: biddu katku:tu:/
8    / b t
9    tt
Fricatives and Affricates
10
11
12    afi:f k i:r/
13
14    /sa:ra fida: fissa: ah/
15    /fa si
16    / i ð
17    aqi:l/
18    /suru:r ð
19    s fu:r mina tt uju:r/
Liquids and Glides
20    /hajja: hajja ja: ja ja/
21    /hajjil ju:ju: lu:lu:/
Nasal Consonants
22
23    t
24
25
26
27    /salma: na:mat/
28    mar/
29
Mix of Nasal and Oral Consonants
30    lilmusi:qa/
31
32
33    /rasamat
34
35    / ajawa:n/
36    ð i: ð /
37
38
LIST OF REFERENCES
Abramson, A., Nye, P., Henderson, J., & Marshall, C. (1981). Vowel height and the perception of consonantal nasality. Journal of the Acoustical Society of America, 70(2), 329-339.
Alghamdi, M. (1998). A spectrographic analysis of Arabic vowels: A cross-dialect study. Journal of King Saud University, 10(1), 3-24.
Ali, L., Gallagher, T., Goldstein, J., & Daniloff, R. (1971). Perception of coarticulated nasality. Journal of the Acoustical Society of America, 49(2), 538-540.
Allen, M. (1997). Understanding regression analysis. New York: Springer Science and Business Media.
Bae, Y., Kuehn, D. P., & Ha, S. (2007). Validity of the Nasometer measuring the temporal characteristics of nasalization. Cleft Palate-Craniofacial Journal, 44(5).
Bognar, E., & Fujisaki, H. (1986). Analysis, synthesis and perception of French nasal vowels. In: Proceedings of ICASSP (pp. 1601-1604).
Bradford, L., Brooks, A., & Shelton, R. (1964). Clinical judgment of hypernasality in cleft palate children. Cleft Palate Journal, 6, 32-335.
Bzoch, K. (1979). Measurement and assessment of categorical aspects of cleft palate speech. In: Bzoch, K. (Ed.), Communication Disorders Related to Cleft Lip and Palate. Boston: Little, Brown.
Bzoch, K. (2004). Communicative Disorders Related to Cleft Lip and Palate (5th ed.). Austin, Texas: Pro-Ed Inc.
Cairns, D., Hansen, J., & Riski, J. (1996). A noninvasive technique for detecting hypernasal speech using a nonlinear operator. IEEE Transactions on Biomedical Engineering, 43(1), 35-45.
Carney, P., & Sherman, D. (1971). Severity of nasality in three selected speech tasks. Journal of Speech, Language, and Hearing Research, 14, 396-407.
Chen, M. (1995). Acoustic parameters of nasalized vowels in hearing-impaired and normal-hearing speakers. Journal of the Acoustical Society of America, 98, 2443-2453.
Chen, M. (1996). Acoustic correlates of nasality in speech (Doctoral dissertation). Retrieved from Theses and Dissertations at Massachusetts Institute of Technology Libraries.
Chen, M. (1997). Acoustic correlates of English and French nasalized vowels. Journal of the Acoustical Society of America, 102, 2360-2370.
Chen, N., Slifka, J., & Stevens, K. (2000). Vowel nasalization in American English: Acoustic variability due to phonetic context. In: Proceedings of 16th ICPhS.
Counihan, D., & Cullinan, V. (1970). Reliability and dispersion of nasality ratings. Cleft Palate Journal, 7, 261.
Dalston, R., & Warren, D. (1986). Comparison of Tonar II, pressure-flow, and listener judgments of hypernasality in the assessment of velopharyngeal function. Cleft Palate Journal, 23(2).
Dalston, R., Warren, D., & Dalston, E. (1991). A preliminary investigation concerning the use of nasometry in identifying patients with hyponasality and/or nasal airway impairment. Journal of Speech and Hearing Disorders, 34(1), 11-18.
Dang, J., Honda, K., & Suzuki, H. (1994). Morphological and acoustical analysis of the nasal and the paranasal cavities. Journal of the Acoustical Society of America, 96(4), 2088-2100.
Delattre, P., & Monnot, M. (1968). The role of duration in the identification of French nasal vowels. International Review of Applied Linguistics, 6, 267-288.
Dickson, D. (1962). Acoustic study of nasality. Journal of Speech and Hearing Research, 5(2), 103-111.
Eadie, T., & Baylor, C. (2006). The effect of perceptual training on inexperienced [...] Journal of Voice, 20, 527-544.
Fant, C. (1960). Descriptive analysis of the acoustic aspects of speech. Logos, 5, 3-17.
Fletcher, S. (1976). Nasalance vs. nasal judgments of nasality. Cleft Palate Journal, 13, 31-44.
Fujimura, O., & Lindqvist, J. (1971). Sweep-tone measurements of vocal-tract characteristics. Journal of the Acoustical Society of America, 49, 541-558.
Gerratt, B., Kreiman, J., Antonanzas-Barroso, N., & Berke, G. (1993). Comparing internal and external standards in voice quality judgments. Journal of Speech and Hearing Research, 36, 14-20.
Goldstein, B. (2010). Encyclopedia of perception. Thousand Oaks, CA: SAGE Publications, Inc.
Hattori, S., Yamamoto, K., & Fujimura, O. (1958). Nasalization of vowels in relation to nasals. Journal of the Acoustical Society of America, 30(4), 267-274.
Havstam, C., Lohmander, A., Persson, C., Dotevall, H., Lith, A., & Lilja, J. (2005). Evaluation of VPI [...] soendoscopy. British Journal of Plastic Surgery, 58, 922-931.
Hawkins, S., & Stevens, K. (1985). Acoustic and perceptual correlates of the nonnasal-nasal distinction for vowels. Journal of the Acoustical Society of America, 77, 1560-1575.
Horii, Y. (1980). An accelerometric approach to nasality measurement: A preliminary report. Cleft Palate Journal, 17, 254-261.
Horii, Y., & Lang, J. (1981). Distributional analyses of an index of nasal coupling (HONC) in simulated hypernasal speech. Cleft Palate Journal, 18(4), 279-285.
Horii, Y., & Monroe, N. (1983). Auditory and visual feedback of nasalization using a modified accelerometric method. Journal of Speech and Hearing Research, 26, 472-475.
House, A., & Stevens, K. (1956). Analog studies of the nasalization of vowels. Journal of Speech and Hearing Disorders, 21(2), 218-232.
Iversen, G., & Gergen, M. (1997). Statistics: The conceptual approach. NY: Springer-Verlag New York, Inc.
Johns, D., Rohrich, R., & Awada, M. (2003). Velopharyngeal incompetence: A guide for clinical evaluation. Plastic and Reconstructive Surgery, 112(7).
Karim, A., Kaykobad, M., & Murshed, M. (2013). Technical Challenges and Design Issues in Bangla Language Processing, (th Ed.). Portland, OR: Book News Inc.
Kataoka, R., Michi, K., Okabe, K., Mirura, T., & Yoshida, H. (1996). Spectral properties and quantitative evaluation of hypernasality in vowels. Cleft Palate-Craniofacial Journal, 33, 43.
Kataoka, R., Zajac, D., Mayo, R., Lutz, R., & Warren, D. (2001). The influence of acoustic and perceptual factors on perceived hypernasality in the vowel [i]: A preliminary study. Folia Phoniatrica et Logopaedica, 53, 198-212.
Kawasaki, H. (1986). Phonetic explanation for phonological universals: The case of distinctive vowel nasalization. In: Ohala, J., & Jaeger, J. (Eds.), Experimental phonology. Orlando, FL: Academic Press, Inc.
Kearns, K., & Simmons, N. (1988). Interobserver reliability and perceptual ratings: More than meets the ear. Journal of Speech and Hearing Research, 31, 131-136.
Kent, R. (1996). Hearing and believing: Some limits to the auditory-perceptual assessment of speech and voice disorders. American Journal of Speech-Language Pathology, 5, 7-23.
Kreiman, J., Gerratt, B., Kempster, G., Erman, A., & Berke, G. (1993). Perceptual evaluation of voice quality: Review, tutorial, and a framework for future research. Journal of Speech and Hearing Research, 36, 21-40.
Kuehn, D., & Moller, K. (2000). Speech and language issues in the cleft palate population: The state of the art. Cleft Palate-Craniofacial Journal, 37, 1-35.
Laczi, E., Sussman, J., Stathopoulos, E., & Huber, J. (2005). Perceptual evaluation of hypernasality compared to HONC measures: The role of experience. Cleft Palate-Craniofacial Journal, 42(2), 202-11.
Ladefoged, P. (1982). A Course in Phonetics (2nd ed.). New York: Harcourt Brace Jovanovich, Inc.
Lee, A., Ciocca, V., & Whitehill, T. (2003). Acoustic correlates of hypernasality. Clinical Linguistics & Phonetics, 17(4-5), 259-64.
Lee, A., Whitehill, T., & Ciocca, V. (2009). Effect of listener training on perceptual judgment of hypernasality. Clinical Linguistics & Phonetics, 23(5), 319-34.
Lee, G., Wang, C., Yang, C., & Kuo, T. (2006). Voice low tone to high tone ratio: A potential quantitative index for vowel [a:] and its nasalization. IEEE Transactions on Biomedical Engineering, 53(7), 1437-9.
Lewis, K., Watterson, T., & Blanton, A. (2008). Comparison of short-term and long-term variability in nasalance scores. Cleft Palate-Craniofacial Journal, 45(5), 495-500.
Lewis, K., Watterson, T., & Houghton, S. (2003). The influence of listener experience and academic training on ratings of nasality. Journal of Communication Disorders, 36, 49-58.
Lewis, K., Watterson, T., & Quint, T. (2000). The effect of vowels on nasalance scores. Cleft Palate-Craniofacial Journal, 37(6), 584-589.
Lindblom, B., Lubker, J., & Pauli, S. (1977). An acoustic-perceptual method for the quantitative evaluation of hypernasality. Journal of Speech and Hearing Research, 20(3), 485-96.
Lintz, L., & Sherman, D. (1961). Phonetic elements and perception of nasality. Journal of Speech and Hearing Research, 4, 381-396.
Litzaw, L., & Dalston, R. (1992). The effect of gender upon nasalance scores among normal adult speakers. Journal of Communication Disorders, 25(1), 55-64.
Maddieson, I. (1984). Patterns of sounds. New York: Cambridge University Press.
Maeda, S. (1993). Acoustics of vowel nasalization and articulatory shifts in French nasal vowels. In: Huffman, M., & Krakow, R. (Eds.), Phonetics and phonology. Vol. 5: Nasalization, and the velum. San Diego, CA: Academic Press, Inc.
Maeda, S. (1982a). A digital simulation method of the vocal tract system. Speech Communication, 1, 199-229.
Maeda, S. (1982c). Acoustic cues for vowel nasalization: A simulation study. Journal of the Acoustical Society of America, 72(S1), S102.
McWilliams, B., Morris, H., & Shelton, R. (1990). Cleft palate speech (2nd ed.). Philadelphia: B.C. Becker.
Mra, Z., Sussman, J. E., & Fenwick, J. (1998). HONC measures in 4- to 6-year-old children. Horii Oral Nasal Coupling Index. Cleft Palate-Craniofacial Journal, 35(5), 408-14.
Pruthi, T. (2007). Analysis, vocal tract modeling and automatic detection of vowel nasalization (Doctoral dissertation). Retrieved from Theses and Dissertations at University of Maryland Libraries.
Pruthi, T., Espy-Wilson, C., & Story, B. H. (2007). Simulation and analysis of nasalized vowels based on magnetic resonance imaging data. Journal of the Acoustical Society of America, 121(6).
Schiavetti, N., Metz, D., & Sitler, R. (1981). Construct validity of direct magnitude estimation and interval scaling: Evidence from a study of the hearing impaired. Journal of Speech and Hearing Research, 24, 441-445.
Schneider, E., & Shprintzen, R. (1980). A survey of speech pathologists: Current trends in the diagnosis and management of velopharyngeal insufficiency. Cleft Palate Journal, 17(3).
Seaver, E., Dalston, R., Leeper, H., & Adams, L. (1991). A study of nasometric value for normal nasal resonance. Journal of Speech and Hearing Research, 34, 715-721.
Selkirk, E. (1984). On the major class features and syllable theory. In: Aronoff, M., & Oehrle, R. (Eds.), Language sound structure. Cambridge, MA: MIT Press.
Stevens, K., Andrade, A., & Viana, M. (1987a). Perception of vowel nasalization in VC contexts: A cross-language study. Journal of the Acoustical Society of America, 82(S1), S119.
Stevens, K., Fant, G., & Hawkins, S. (1987b). Some acoustical and perceptual correlates of nasal vowels. In: In Honor of Ilse Lehiste (pp. 241-254). Foris Publications.
Stevens, S. (1974). Perceptual magnitude and its measurement. In: Carterette, E., & Friedman, M. (Eds.), Handbook of perception. Vol. II: Psychophysical judgment and measurement (pp. 361-389). New York: Academic Press.
Sussman, J. (1995). HONC measures in men and women: Validity and variability. Cleft Palate-Craniofacial Journal, 32, 37-48.
Vijayalakshmi, P., & Ramasubba Reddy, M. (2004). Analysis of hypernasality by synthesis. In: Proc. of 8th International Conf. on Spoken Language Processing (ICSLP 2004), Jeju Island, Korea, Vol. 1, pp. 525-528.
Vogel, A., Ibrahim, H., Reilly, S., & Kilpatrick, N. (2009). A comparative study of two acoustic measures of hypernasality. Journal of Speech, Language, and Hearing Research, 52(6), 1640-51.
Warren, D., & DuBois, A. (1964). A pressure-flow technique for measuring velopharyngeal orifice area during speech. Cleft Palate Journal, 1, 52-71.
Warren, D. (1976). Aerodynamics of sound production. In: Lass, N. (Ed.), Contemporary issues in experimental phonetics. Springfield, IL: Thomas.
Warren, D. (1979). PERCI: A method for training palatal efficiency. Cleft Palate Journal, 16, 279-285.
Watterson, T., Lewis, K., & Deutsch, C. (1998). Nasalance and nasality in low pressure and high pressure speech. Cleft Palate-Craniofacial Journal, 35(4), 293-298.
Watterson, T., Lewis, K., & Foley-Homan, N. (1999). Effects of stimulus length on nasalance scores. Cleft Palate-Craniofacial Journal, 36(3), 243-247.
Watterson, T., Lewis, K., Allord, M., Sulprizio, S., & O'Neill, P. (2007). Effect of vowel type on reliability of nasality ratings. Journal of Communication Disorders, 40(6), 503-12.
Whitehill, T., Lee, A., & Chun, J. (2002). Direct magnitude estimation and interval scaling of hypernasality. Journal of Speech, Language, and Hearing Research, 45, 80-8.
Zraick, R., & Liss, J. M. (2000). A comparison of equal-appearing interval scaling and direct magnitude estimation of nasal voice quality. Journal of Speech, Language, and Hearing Research, 43(4), 979-988.
BIOGRAPHICAL SKETCH
Nour El Bashiti was born in 1981 in Amman, Jordan. She earned her B.S. in speech and hearing from Jordan University of Science and Technology in 2003. She earned her M.S. in speech-language pathology from the University of Jordan in 2005. Upon graduation, Nour worked for 12 months as a speech-language pathologist at the Center for Phonetics Research Clinic at the University of Jordan. After that, she worked as a lecturer in the Department of Speech-Language and Hearing Sciences, Rehabilitation Sciences Faculty, at the University of Jordan. She began pursuing her doctorate at the University of Florida in August 2008. Upon completion of her doctoral studies in the Speech, Language and Hearing Sciences Department in 2014, she will begin an assistant professor position at the University of Jordan, Amman, Jordan.