SWIPE

Permanent Link: http://ufdc.ufl.edu/UFE0021589/00001

Material Information

Title: SWIPE: A Sawtooth Waveform Inspired Pitch Estimator for Speech and Music
Physical Description: 1 online resource (116 p.)
Language: English
Creator: Camacho, Arturo
Publisher: University of Florida
Place of Publication: Gainesville, Fla.
Publication Date: 2007

Subjects

Subjects / Keywords: estimation, frequency, fundamental, harmonics, music, numbers, pitch, prime, sawtooth, speech, tracking
Computer and Information Science and Engineering -- Dissertations, Academic -- UF
Genre: Computer Engineering thesis, Ph.D.
bibliography   ( marcgt )
theses   ( marcgt )
government publication (state, provincial, territorial, dependent)   ( marcgt )
born-digital   ( sobekcm )
Electronic Thesis or Dissertation

Notes

Abstract: A Sawtooth Waveform Inspired Pitch Estimator (SWIPE) has been developed for processing speech and music. SWIPE is shown to outperform existing algorithms on several publicly available speech/musical-instruments databases and a disordered speech database. SWIPE estimates the pitch as the fundamental frequency of the sawtooth waveform whose spectrum best matches the spectrum of the input signal. A decaying cosine kernel provides an extension to older frequency-based, sieve-type estimation algorithms by providing smooth peaks with decaying amplitudes to correlate with the harmonics of the signal. An improvement on the algorithm is achieved by using only the first and prime harmonics, which significantly reduces subharmonic errors commonly found in other pitch estimation algorithms.
General Note: In the series University of Florida Digital Collections.
General Note: Includes vita.
Bibliography: Includes bibliographical references.
Source of Description: Description based on online resource; title from PDF title page.
Source of Description: This bibliographic record is available under the Creative Commons CC0 public domain dedication. The University of Florida Libraries, as creator of this bibliographic record, has waived all rights to it worldwide under copyright law, including all related and neighboring rights, to the extent allowed by law.
Statement of Responsibility: by Arturo Camacho.
Thesis: Thesis (Ph.D.)--University of Florida, 2007.
Local: Adviser: Harris, John G.

Record Information

Source Institution: UFRGP
Rights Management: Applicable rights reserved.
Classification: lcc - LD1780 2007
System ID: UFE0021589:00001

Full Text





SWIPE: A SAWTOOTH WAVEFORM INSPIRED PITCH ESTIMATOR
FOR SPEECH AND MUSIC




















By

ARTURO CAMACHO


A DISSERTATION PRESENTED TO THE GRADUATE SCHOOL
OF THE UNIVERSITY OF FLORIDA IN PARTIAL FULFILLMENT
OF THE REQUIREMENTS FOR THE DEGREE OF
DOCTOR OF PHILOSOPHY

UNIVERSITY OF FLORIDA

2007




































© 2007 Arturo Camacho




























I dedicate this dissertation to my dear grandparents
Hugo and Flory









ACKNOWLEDGMENTS

I thank my grandparents for all the support they have given to me during my life, my wife

for her support during the years in graduate school, and my daughter who was in my arms when

the most important ideas expressed here came to my mind. I also thank Dr. John Harris for his

guidance during my research and for always pushing me to do more, Dr. Rahul Shrivastav for his

support and for introducing me to the world of auditory system models, and Dr. Manuel

Bermudez for his unconditional support all these years.












TABLE OF CONTENTS



ACKNOWLEDGMENTS

LIST OF TABLES

LIST OF FIGURES

LIST OF ABBREVIATIONS

ABSTRACT

CHAPTER

1 INTRODUCTION

1.1 Pitch Background
1.1.1 Conceptual Definition
1.1.2 Operational Definition
1.1.3 Strength
1.1.4 Duration Threshold
1.2 Illustrative Examples and Pitch Determination Hypotheses
1.2.1 Pure Tone
1.2.2 Sawtooth Waveform and the Largest Peak Hypothesis
1.2.3 Missing Fundamental and the Components Spacing Hypothesis
1.2.4 Square Wave and the Maximum Common Divisor Hypothesis
1.2.5 Alternating Pulse Train
1.2.6 Inharmonic Signals
1.3 Loudness
1.4 Equivalent Rectangular Bandwidth
1.5 Dissertation Organization
1.6 Summary

2 PITCH ESTIMATION ALGORITHMS: PROBLEMS AND SOLUTIONS

2.1 Harmonic Product Spectrum (HPS)
2.2 Sub-harmonic Summation (SHS)
2.3 Subharmonic to Harmonic Ratio (SHR)
2.4 Harmonic Sieve (HS)
2.5 Autocorrelation (AC)
2.6 Average Magnitude and Squared Difference Functions (AMDF, ASDF)
2.7 Cepstrum (CEP)
2.8 Summary

3 THE SAWTOOTH WAVEFORM INSPIRED PITCH ESTIMATOR

3.1 Initial Approach: Average Peak-to-Valley Distance Measurement
3.2 Blurring of the Harmonics
3.3 Warping of the Spectrum
3.4 Weighting of the Harmonics
3.5 Number of Harmonics
3.6 Warping of the Frequency Scale
3.7 Window Type and Size
3.8 SWIPE
3.9 SWIPE'
3.9.1 Pitch Strength of a Sawtooth Waveform
3.10 Reducing Computational Cost
3.10.1 Reducing the Number of Fourier Transforms
3.10.1.1 Reducing window overlap
3.10.1.2 Using only power-of-two window sizes
3.10.2 Reducing the Number of Spectral Integral Transforms
3.11 Summary

4 EVALUATION

4.1 Algorithms
4.2 Databases
4.3 Methodology
4.4 Results
4.5 Discussion

5 CONCLUSION

APPENDIX

A MATLAB IMPLEMENTATION OF SWIPE'

B DETAILS OF THE EVALUATION

B.1 Databases
B.1.1 Paul Bagshaw's Database
B.1.2 Keele Pitch Database
B.1.3 Disordered Voice Database
B.1.4 Musical Instruments Database
B.2 Evaluation Using Speech
B.3 Evaluation Using Musical Instruments

C GROUND TRUTH PITCH FOR THE DISORDERED VOICE DATABASE

REFERENCES

BIOGRAPHICAL SKETCH











LIST OF TABLES

Table

3-1 Common windows used in signal processing

4-1 Gross error rates for speech

4-2 Proportion of overestimation errors relative to total gross errors

4-3 Gross error rates by gender

4-4 Gross error rates for musical instruments

4-5 Gross error rates by instrument family

4-6 Gross error rates for musical instruments by octave

4-7 Gross error rates for musical instruments by dynamic

4-8 Gross error rates for variations of SWIPE'

C-1 Ground truth pitch values for the disordered voice database












LIST OF FIGURES


Figure

1-1 Sawtooth waveform

1-2 Pure tone

1-3 Missing fundamental

1-4 Square wave

1-5 Pulse train

1-6 Alternating pulse train

1-7 Inharmonic signal

1-8 Equivalent rectangular bandwidth

1-9 Equivalent-rectangular-bandwidth scale

2-2 Harmonic product spectrum

2-3 Subharmonic summation

2-4 Subharmonic summation with decay

2-5 Subharmonic to harmonic ratio

2-6 Harmonic sieve

2-7 Autocorrelation

2-8 Comparison between AC, BAC, ASDF, and AMDF

2-9 Cepstrum

2-10 Problem caused to cepstrum by cosine lobe at DC

3-1 Average-peak-to-valley-distance kernel

3-3 Necessity of strictly convex kernels

3-4 Kernels formed from concatenations of truncated squarings, Gaussians, and cosines

3-5 Warping of the spectrum

3-6 Weighting of the harmonics

3-7 Fourier transform of rectangular window

3-8 Cosine lobe and square-root of the spectrum of rectangular window

3-9 Hann window

3-10 Fourier transform of the Hann window

3-11 Cosine lobe and square-root of the spectrum of Hann window

3-12 SWIPE kernel

3-13 Most common pitch estimation errors

3-14 SWIPE' kernel

3-15 Pitch strength of sawtooth waveform

3-16 Windows overlapping

3-17 Idealized spectral lobes

3-18 K+-normalized inner product between template and idealized spectral lobes

3-19 Individual and combined pitch strength curves

3-20 Pitch strength loss when using suboptimal window sizes

3-21 Coefficients of the pitch strength interpolation polynomial

3-22 Interpolated pitch strength










LIST OF OBJECTS


Object

Object 1-1. Sawtooth waveform (WAV file, 32 KB).

Object 1-2. Pure tone (WAV file, 32 KB).

Object 1-3. Missing fundamental (WAV file, 32 KB).

Object 1-4. Square wave (WAV file, 32 KB).

Object 1-5. Pulse train (WAV file, 32 KB).

Object 1-6. Alternating pulse train (WAV file, 32 KB).

Object 1-7. Inharmonic signal (WAV file, 32 KB).

Object 2-1. Bandpass filtered /u/ (WAV file, 6 KB).

Object 2-2. Signal with strong second harmonic (WAV file, 32 KB).

Object 3-1. Beating tones (WAV file, 32 KB).










LIST OF ABBREVIATIONS


AC Autocorrelation

AMDF Average magnitude difference function

APVD Average peak-to-valley distance

ASDF Average squared difference function

BAC Biased autocorrelation

CEP Cepstrum

ERB Equivalent rectangular bandwidth

ERBs Equivalent-rectangular-bandwidth scale

FFT Fast Fourier transform

HPS Harmonic product spectrum

HS Harmonic sieve

IP Inner product

IT Integral transform

ISL Idealized spectral lobe

K+-NIP K+-normalized inner product

NIP Normalized inner product

O-WS Optimal window size

P2-WS Power-of-two window size

SHS Subharmonic summation

SHR Subharmonic-to-harmonic ratio

STFT Short-time Fourier transform

SWIPE Sawtooth Waveform Inspired Pitch Estimator

UAC Unbiased autocorrelation

WS Window size









Abstract of Dissertation Presented to the Graduate School
of the University of Florida in Partial Fulfillment of the
Requirements for the Degree of Doctor of Philosophy

SWIPE: A SAWTOOTH WAVEFORM INSPIRED PITCH ESTIMATOR
FOR SPEECH AND MUSIC


By

Arturo Camacho

December 2007

Chair: John G. Harris
Major: Computer Engineering

A Sawtooth Waveform Inspired Pitch Estimator (SWIPE) has been developed for

processing speech and music. SWIPE is shown to outperform existing algorithms on several

publicly available speech/musical-instruments databases and a disordered speech database.

SWIPE estimates the pitch as the fundamental frequency of the sawtooth waveform whose

spectrum best matches the spectrum of the input signal. A decaying cosine kernel provides an

extension to older frequency-based, sieve-type estimation algorithms by providing smooth peaks

with decaying amplitudes to correlate with the harmonics of the signal. An improvement on the

algorithm is achieved by using only the first and prime harmonics, which significantly reduces

subharmonic errors commonly found in other pitch estimation algorithms.









CHAPTER 1
INTRODUCTION

Pitch is an important characteristic of sound, providing information about the sound's

source. In speech, pitch helps to identify the gender of the speaker (pitch tends to be higher for

females than for males) (Wang and Lin, 2004), gives additional meaning to words (e.g., a group

of words can be interpreted as a question depending on whether the pitch is rising or not), and

may help to identify the emotional state of the speaker (e.g., joy produces high pitch and a wide

pitch range, while sadness produces normal to low pitch and a narrow pitch range) (Murray and

Arnott, 1993). Pitch is also important in music because it determines the names of the notes

(Sethares, 1998).

Pitch estimation also has applications in many areas that involve processing of sound:

music, communications, linguistics, and speech pathology. In music, one of the main

applications of pitch estimation is automatic music transcription. Musicologists are often faced

with music for which no transcription exists. Therefore, automated tools that extract the pitch of

a melody, and from there the individual musical notes, are invaluable tools for musicologists

(Askenfelt, 1979). Automated transcription has also been used in query-by-humming systems

(e.g., Dannenberg et al., 2004). These systems allow people to search for music in databases by

singing or humming the melody rather than typing the title of the song, which may be unknown

to the user or the database.

In communications, pitch estimation is used for speech coding (Spanias, 1994). Many

speech coding systems are based on the source-filter model (Fant, 1960), which models speech

as a filtered source signal. In some implementations, the source is either a periodic sequence of

glottal pulses (for voiced sound) or white noise (for unvoiced sound). Therefore, the correct

estimation of the glottal pulse rate is crucial for the correct coding of voiced speech.









Pitch estimators are useful in linguistics for the recognition of intonation patterns, which

are used, for example, in the acquisition of a second language (de Bot, 1983). Pitch estimators

are also used in speech pathology to determine speech disorders, which are characterized by high

levels of noise in the voice. Since most methods used to estimate noise are based on the

fundamental frequency of the signal (e.g., Yumoto and Gould, 1982), pitch estimators are of vital

importance in this area.

The goal of our work is to develop an automatic pitch estimator that operates on both

speech and music. The algorithm should be competitive with the best known pitch estimators,

and therefore be suitable for the many applications mentioned above. Furthermore, the algorithm

should provide a measure to determine if a pitch exists or not in each region of the signal. The

remaining sections of this chapter present several psychoacoustic definitions and phenomena

that will be used to explain the operation and rationale of the algorithm.

1.1 Pitch Background

1.1.1 Conceptual Definition

Several conceptual definitions of pitch have been proposed. The American Standards

Association (ASA, 1960) definition of pitch is

"Pitch is that attribute of auditory sensation in terms of which sounds may be ordered on a
musical scale,"

and the American National Standards Institute (ANSI, 1994) definition of pitch is

"Pitch is that auditory attribute of sound according to which sounds can be ordered on a
scale from low to high. Pitch depends mainly on the frequency content of the sound
stimulus, but it also depends on the sound pressure and the waveform of the stimulus."

These definitions mention an attribute that allows ordering sounds on a scale, but they say

nothing about what that attribute is.









We will propose another definition of pitch, which is based on the fundamental frequency

of a signal. The fundamental frequency f_0 of a signal (sound or no sound) exists only for periodic

signals, and is defined as the inverse of the period of the signal, where the period T_0 of the signal

(a.k.a. fundamental period) is the minimum repetition interval of the signal x(t), i.e.,

$$T_0 = \min \{\, T > 0 \mid \forall t : x(t) = x(t + T) \,\}. \tag{1-1}$$

It is also possible to define the fundamental frequency in the frequency domain:

$$f_0 = \max \Big\{\, f > 0 \;\Big|\; \exists \{c_k\}, \{\phi_k\} : x(t) = \sum_{k=0}^{\infty} c_k \sin(2\pi k f t + \phi_k) \Big\}. \tag{1-2}$$

Although both equations are mathematically equivalent (i.e., it can be shown that f_0 = 1/T_0), they

are conceptually different: Equation 1-1 looks at the signal in the time domain, while

Equation 1-2 looks at the signal as a combination of sinusoids using a Fourier series expansion.

The key element for periodicity in Equation 1-1 is the equality in x(t) = x(t + T), and the key

element for periodicity in Equation 1-2 is the existence of components only at multiples of the

fundamental frequency. Unfortunately, no signal in nature is perfectly periodic because of

natural variations in frequency and amplitude, and contamination produced by noise.

Nevertheless, when listening to many natural signals, we perceive pitch. This suggests that, to

determine pitch, the brain probably uses either a modified version of Equation 1-1, where the

equality x(t) = x(t + T) is substituted by an approximation, or a modified version of Equation 1-2,

where noise and fluctuations in the frequency of the components are allowed. Based on this

suggestion, we define pitch as the perceived "fundamental frequency" of a sound, in other words,

as our brain's estimate of the (quasi) fundamental frequency of a sound.
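
As an illustration of the approximate form of Equation 1-1 suggested above, the following minimal Python sketch searches for the smallest lag at which a sampled signal nearly repeats. The function name, the sampling rate, the test signal, and the tolerance `tol` are assumptions made for this example; they are not taken from the dissertation, and this is not the SWIPE algorithm.

```python
import numpy as np

def fundamental_period(x, fs, tol=1e-3):
    """Return the smallest lag T (in seconds) for which x(t) ~= x(t + T).

    tol is a hypothetical threshold on the mean squared difference,
    relative to the signal power. The strict equality of Equation 1-1
    corresponds to tol = 0.
    """
    x = np.asarray(x, dtype=float)
    power = np.mean(x ** 2)
    # Only test lags up to half the signal so that enough samples overlap.
    for lag in range(1, len(x) // 2):
        diff = x[lag:] - x[:-lag]
        if np.mean(diff ** 2) <= tol * power:
            return lag / fs
    return None  # no repetition interval found within the tested lags

fs = 8000                       # sampling rate in Hz (assumed)
t = np.arange(fs) / fs          # one second of samples
# Periodic test signal with components at 200 Hz and 300 Hz only,
# so its fundamental frequency is 100 Hz although no 100 Hz component exists.
x = np.sin(2 * np.pi * 200 * t) + 0.5 * np.sin(2 * np.pi * 300 * t)
T0 = fundamental_period(x, fs)
f0 = 1.0 / T0
```

For this test signal the search stops at a lag of 80 samples (10 ms), which corresponds to f_0 = 100 Hz, the largest frequency of which both components are multiples, in agreement with Equation 1-2.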

1.1.2 Operational Definition

Since the previous definitions of pitch do not indicate how to measure it, they are of no

practical use, and an operational definition of pitch is required. The usual way in which pitch is










measured is the following. A listener is presented with two sounds: a target sound, for which the

pitch is to be determined, and a matching sound. The matching sound is usually a pure tone,

although sometimes harmonic complex tones are used as well. The levels of the target and the

matching sounds are usually equalized to avoid any effect of differences in level in the

perception of pitch. The sounds are presented sequentially, simultaneously, or in any

combination of them, depending on the design of the experiment. The listener is asked to adjust

the fundamental frequency of the matching sound such that it matches the target sound, in the

sense of the conceptual definitions of pitch presented above. The fundamental frequency of the

matching sound is recorded and the experiment is repeated several times and with different

listeners. The data is summarized, and if the distribution of fundamental frequencies shows a

clear peak around a certain frequency, the target sound is said to have a pitch corresponding to

that frequency.

1.1.3 Strength

Some sounds elicit a strong pitch sensation, and some do not. For example, when we

speak, some sounds are highly periodic and elicit a strong pitch sensation (e.g., vowels), but

some do not (e.g., some consonants: /s/, /sh/, /p/, and /k/). In the case of musical instruments, the

attack tends to contain transient components that obscure the pitch, but they disappear quickly

letting the pitch show more clearly. The quality of the sound that allows us to determine whether

pitch exists is called pitch strength. Pitch strength is not a categorical variable but a continuum.

Also, pitch strength is independent of pitch: two sounds may have the same pitch and differ in

pitch strength. For example, a pure tone and a narrow band of noise centered at the frequency of

the tone have the same pitch; however, the pure tone elicits a stronger pitch sensation than the

noise.









Unfortunately, not much research exists on pitch strength, and the few studies that exist

have concentrated mostly on noise (Yost, 1996; Wiegrebe and Patterson, 1998), although some

have explored harmonic sounds as well (Fastl and Stoll, 1979; Shofner and Selas, 2002). In terms

of variety of sounds, the most complete study is probably Fastl and Stoll's, which included pure

tones, complex tones, and several types of noises. In that study, pure tones were reported to have

the strongest pitch among all sounds. However, other studies have found that pitch identification

improves as harmonics are added (Houtsma, 1990), which suggests that pitch strength increases

as well.

We hypothesize that our brain determines pitch by searching for a match between our

voice, produced or imagined, and the target signal for which pitch is to be determined, probably

based on their spectra. This hypothesis agrees with studies of pitch determination in which

subjects have been allowed to hum the target sound to facilitate pitch matching tasks (Houtgast,

1976). Based on this hypothesis, we believe that the higher the similarity of the target signal with

our voice, the higher its pitch strength. If the similarity is based on the spectrum, a signal will

have maximum pitch strength when its spectrum is closest to the spectrum of a voiced sound. If

we assume that voiced sounds have harmonic spectra with envelopes that decay on average as 1/f

(i.e., inversely proportional to frequency) (Fant, 1960), then a signal will have maximum pitch

strength if its spectrum has that structure.

An example of a signal with this property is a sawtooth waveform, which is exemplified

in Figure 1-1. A sawtooth waveform is formed by adding sines with frequencies that are

multiples of a common fundamental f_0, and whose amplitude decays inversely proportional to

frequency:

$$x(t) = \sum_{k=1}^{\infty} \frac{1}{k} \sin(2\pi k f_0 t). \tag{1-3}$$
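
The additive construction of the sawtooth waveform can be sketched numerically as follows. This is a minimal illustration, assuming a sampling rate of 8 kHz, a fundamental of 100 Hz, and a sum truncated below the Nyquist frequency; the function name and all parameter values are hypothetical and are not taken from the dissertation.

```python
import numpy as np

def sawtooth_partial_sum(f0, fs, duration, n_harmonics):
    """Sum the first n_harmonics terms of Equation 1-3."""
    t = np.arange(int(round(fs * duration))) / fs
    x = np.zeros_like(t)
    for k in range(1, n_harmonics + 1):
        x += np.sin(2 * np.pi * k * f0 * t) / k
    return t, x

fs = 8000          # sampling rate in Hz (assumed)
f0 = 100           # fundamental frequency in Hz (assumed)
# Keep all harmonics below the Nyquist frequency fs / 2 to avoid aliasing.
n_harmonics = 39
t, x = sawtooth_partial_sum(f0, fs, duration=1.0, n_harmonics=n_harmonics)

# With a one-second signal, FFT bin k * f0 holds harmonic k exactly.
magnitude = 2 * np.abs(np.fft.rfft(x)) / len(x)
harmonic_amplitudes = magnitude[f0 * np.arange(1, n_harmonics + 1)]
```

Under these assumptions, `harmonic_amplitudes` follows the 1/k decay stated in Equation 1-3, that is, the amplitude of each component is inversely proportional to its frequency.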










Though sawtooth waveforms play a key role in our research, their importance resides in

their spectrum, and not in their time-domain waveform. In particular, the phase of the

components can be manipulated (destroying its sawtooth waveform shape) and the signal would

still play the same role in our work as the sawtooth waveform. In other words, it is assumed in

this work that what matters to estimate pitch and its strength is the amplitude of the spectral

components of the sound, and not their phase, which in fact is ignored here. However, phase

does play a role in pitch perception, as has been shown by some researchers (Moore, 1977;

Shackleton and Carlyon, 1994; Galembo et al., 2001). These researchers have created pairs of

sounds that have the same spectral amplitudes but significantly different pitches, by choosing the

phases of the components appropriately. Nevertheless, it is not the aim of this research to cover

the whole range of pitch phenomena, but to concentrate only on the most common speech and











Figure 1-1. Sawtooth waveform. A) Signal (horizontal axis: time, 0 to 50 ms). B) Spectrum (horizontal axis: frequency, 0 to 1000 Hz).

Object 1-1. Sawtooth waveform (WAV file, 32 KB).










musical instrument sounds. As we will see later, good pitch predictions are obtained for these

types of sounds based solely on the amplitude of their spectral components.
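
The assumption that only the amplitudes of the spectral components matter can be illustrated with a short sketch: two signals are built from the same harmonic amplitudes (1/k), one with zero phases and one with random phases. The sampling rate, fundamental, number of harmonics, and random seed are assumptions made for this example only.

```python
import numpy as np

rng = np.random.default_rng(0)          # fixed seed, for repeatability
fs, f0, n_harmonics = 8000, 100, 39     # assumed values
t = np.arange(fs) / fs                  # one second of samples
k = np.arange(1, n_harmonics + 1)

def harmonic_signal(phases):
    """Sum of harmonics of f0 with amplitudes 1/k and the given phases."""
    # Rows index harmonics, columns index time samples.
    angles = 2 * np.pi * f0 * np.outer(k, t) + phases[:, None]
    return np.sum(np.sin(angles) / k[:, None], axis=0)

def amplitude_spectrum(x):
    """Amplitude of each FFT bin; the phase is discarded."""
    return 2 * np.abs(np.fft.rfft(x)) / len(x)

saw = harmonic_signal(np.zeros(n_harmonics))
scrambled = harmonic_signal(rng.uniform(0, 2 * np.pi, n_harmonics))

same_spectrum = np.allclose(amplitude_spectrum(saw),
                            amplitude_spectrum(scrambled), atol=1e-9)
same_waveform = np.allclose(saw, scrambled)
```

Here `same_spectrum` is true while `same_waveform` is false: the phase-scrambled signal no longer looks like a sawtooth in the time domain, yet an estimator that uses only spectral amplitudes cannot distinguish it from the sawtooth.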

1.1.4 Duration Threshold

Doughty and Garner (1947) studied the minimum duration required to perceive a pitch for

a pure tone. They found that there are two duration thresholds with two different properties.

Tones with durations below the shorter threshold are perceived as a click, and no pitch is

perceived. Tones with durations between the two thresholds are perceived as having pitch, and

an increase in their duration causes an increase in their pitch strength. Tones with durations

above the larger threshold are also perceived as having pitch, but further increases in their

duration do not increase their pitch strength.

These thresholds are not constant, but approximately proportional to the pitch period of the

tone. In other words, the threshold corresponds to a certain number of periods of the tone.

However, there is some interaction between pitch and the minimum number of cycles required to

perceive it (lower frequencies have a tendency to require fewer cycles to elicit a pitch). The

shorter threshold is approximately two to four cycles, and the larger threshold is approximately

three to ten cycles. For frequencies above 1 kHz the thresholds become constant: 4 ms the

shorter and 10 ms the larger, regardless of their corresponding number of cycles.
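
The conversion between cycles and milliseconds implied by these thresholds can be worked out with a small helper. This is only an arithmetic sketch: it assumes the upper ends of the reported ranges (4 and 10 cycles) below 1 kHz, which makes the values continuous with the constant 4 ms and 10 ms thresholds at 1 kHz. Since lower frequencies tend to require fewer cycles, the values it returns for low frequencies should be read as upper bounds. The function name and defaults are hypothetical.

```python
def duration_thresholds_ms(frequency_hz, short_cycles=4, long_cycles=10):
    """Return (shorter, larger) duration thresholds in milliseconds.

    short_cycles and long_cycles are assumed values taken from the upper
    ends of the ranges quoted in the text (two to four and three to ten).
    """
    if frequency_hz >= 1000:
        # Above 1 kHz the thresholds are constant regardless of cycle count.
        return 4.0, 10.0
    period_ms = 1000.0 / frequency_hz
    return short_cycles * period_ms, long_cycles * period_ms

low = duration_thresholds_ms(100)     # 100 Hz tone: period of 10 ms
high = duration_thresholds_ms(2000)   # 2 kHz tone: constant thresholds
```

For a 100 Hz tone the helper gives 40 ms and 100 ms, whereas for a 2 kHz tone it gives the constant 4 ms and 10 ms, which correspond to 8 and 20 cycles respectively.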

Robinson and Patterson (1995a; 1995b) studied note discriminability as a function of the

number of cycles in the sound using strings, brass, flutes, and harpsichords. A large increase in

discriminability can be observed in their data as the number of cycles increases from one to

about ten, but beyond ten cycles the discriminability of the notes does not seem to increase

much. This trend agrees with the thresholds for pure tones mentioned above, which suggests that

the thresholds are also valid for musical instruments, and probably for sawtooth waveforms as

well.










1.2 Illustrative Examples and Pitch Determination Hypotheses

In previous sections, conceptual and operational definitions of pitch were given. From a practical

point of view, both types of definitions are of limited use since the conceptual definitions are too

abstract and the operational definition requires a human to determine the pitch. In this section we

propose more algorithmic ways for determining pitch, through the search for cues that may give

us hints regarding the pitch. These cues, hereafter referred to as hypotheses, are illustrated with

examples of sounds in which they are valid, and examples in which they are not.

1.2.1 Pure Tone

From a frequency domain point of view, the simplest periodic sound is a pure tone. A pure

tone with a frequency of 100 Hz and its spectrum are shown in Figure 1-2. Based on our

operational definition of pitch (i.e., the one that uses a pure tone as matching tone presented at

the same intensity level as the testing tone), the pitch of a pure tone is its frequency, and











Figure 1-2. Pure tone. A) Signal (horizontal axis: time, 0 to 50 ms). B) Spectrum (horizontal axis: frequency, 0 to 500 Hz).

Object 1-2. Pure tone (WAV file, 32 KB).









therefore frequency determines pitch in this case. This may not be true if the tones are presented

at different levels. Intriguingly, the pitch of a pure tone may change with intensity level (Stevens,

1935): as intensity increases, the pitch of high frequency tones tends to increase, and the pitch of

low frequency tones tends to decrease. However, this change is usually less than 1% or 2%

(Verschuure and van Meeteren, 1975), occurs at very disparate intensity levels, and varies

significantly from person to person.

Since the goal in this research is to predict pitch for sounds represented in a computer as a

sequence of numbers without knowing the level at which the sound will be played, it will be

assumed that the sound will be played at a "comfortable" level, and therefore the algorithm will

be designed to predict the pitch at that level. Nevertheless, variations of pitch with level are

small, and have little effect even for complex tones (Fastl, 2007); otherwise, music would

become out of tune as we change the volume.

1.2.2 Sawtooth Waveform and the Largest Peak Hypothesis

The sawtooth waveform presented in Section 1.1.3 was shown to have a harmonic

spectrum with components whose amplitude decays inversely proportional to frequency (see

Figure 1-1). The computational determination of the pitch of a sawtooth waveform is not as easy

as it is for a pure tone because its spectrum has more than one component. Since the pitch of a

sawtooth waveform corresponds to its fundamental frequency, and the fundamental frequency in

this case is the component with the highest energy, one possible hypothesis for the derivation of

the pitch is that the pitch corresponds to the largest peak in the spectrum. However, as we will

show in the next section, this hypothesis does not always hold.

1.2.3 Missing Fundamental and the Components Spacing Hypothesis

This section shows that it is possible to create a periodic sound with a pitch corresponding

to a frequency at which there is no energy in the spectrum. A sound with such property is said to


















Figure 1-3. Missing fundamental. A) Signal (horizontal axis: time, ms). B) Spectrum (horizontal axis: frequency, Hz).

Object 1-3. Missing fundamental (WAV file, 32 KB).



have a missing fundamental. It is easy to build such a signal: just take a sawtooth waveform and

remove its fundamental, as shown in Figure 1-3. Certainly, the timbre of the sound will change,

but not its pitch. This fact disproves the hypothesis that the pitch corresponds to the largest peak

in the spectrum.

After it was realized that the pitch of a complex tone was unaffected by removing the

fundamental frequency, it was hypothesized that the pitch corresponds to the spacing of the

frequency components. However, this hypothesis is not always valid, as we will show in the next

section.

1.2.4 Square Wave and the Maximum Common Divisor Hypothesis

The previous section hypothesized that the pitch corresponds to the spacing between the

frequency components. However, it is easy to find an example for which this hypothesis fails: a

















Figure 1-4. Square wave. A) Signal (horizontal axis: time, ms). B) Spectrum (horizontal axis: frequency, Hz).

Object 1-4. Square wave (WAV file, 32 KB).



square wave. A square wave is similar to a sawtooth wave, but does not have even order

harmonics:

$$ x(t) = \sum_{k=1}^{\infty} \frac{1}{2k-1} \sin 2\pi (2k-1) f_0 t \tag{1-4} $$

A square wave with a fundamental frequency of 100 Hz and its spectrum is shown in

Figure 1-4. The components are located at odd multiples of 100 Hz, producing a spacing of 200

Hz between them. However, the fundamental frequency, and indeed its pitch, is 100 Hz. Thus,

the components spacing hypothesis is invalid.

A hypothesis that seems to work for this example, and all the previous ones, is that the

pitch must correspond to the maximum common divisor of the frequency components. As shown

in Equation 1-2, this is equivalent to saying that the pitch corresponds to the fundamental

frequency. However, we will show in the next section that this hypothesis is also wrong.
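
The two hypotheses compared in this section can be checked numerically on the square-wave example. The following is a minimal sketch, assuming the components are listed as integer frequencies in Hz; the helper names `component_spacing` and `common_divisor` are illustrative and do not come from the dissertation.

```python
from functools import reduce
from math import gcd

def component_spacing(freqs):
    """Smallest gap between adjacent components (Hz)."""
    s = sorted(freqs)
    return min(b - a for a, b in zip(s, s[1:]))

def common_divisor(freqs):
    """Greatest common divisor of integer component frequencies (Hz)."""
    return reduce(gcd, freqs)

# Odd harmonics of 100 Hz up to 1 kHz (square wave, Equation 1-4).
square = [100 * (2 * k - 1) for k in range(1, 6)]
spacing = component_spacing(square)   # components spacing hypothesis
divisor = common_divisor(square)      # maximum common divisor hypothesis
print(spacing, divisor)
```

For this spectrum the spacing is 200 Hz while the common divisor is 100 Hz, which is the fundamental frequency stated in the text.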




























































1.2.5 Alternating Pulse Train


A pulse train is a sum of pulses separated by a constant time interval $T_0$:

$$ x(t) = \sum_{k=-\infty}^{\infty} \delta(t - kT_0), \tag{1-5} $$

where $\delta$ is the delta or pulse function, a function whose value is one if its argument is zero, and


zero otherwise. A pulse train with a fundamental frequency of 100 Hz (fundamental period of 10


ms) and its spectrum are shown in Figure 1-5. The spectrum of a pulse train is another pulse train


with pulses at multiples of the fundamental frequency, which corresponds to the pitch. If the


signal is modified by decreasing the height of every other pulse in the time domain to 0.7, as


shown in Figure 1-6, the period of the signal will change to 20 ms. This will be reflected in the


spectrum as a change in the fundamental frequency from 100 Hz to 50 Hz. However, although


this change may cause an effect on the timbre (depending on the overall level of the signal), the


Figure 1-5. Pulse train. A) Signal (horizontal axis: time, ms). B) Spectrum (horizontal axis: frequency, Hz).

Object 1-5. Pulse train (WAV file, 32 KB).


Figure 1-6. Alternating pulse train. A) Signal (horizontal axis: time, ms). B) Spectrum (horizontal axis: frequency, Hz).

Object 1-6. Alternating pulse train (WAV file, 32 KB).



pitch will remain the same: 100 Hz, refuting the hypothesis that the pitch of a sound corresponds

to its fundamental frequency (i.e., the maximum common divisor of the frequency components).
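
The change in fundamental frequency described above can be reproduced with a discrete-time sketch. It assumes a sampling rate of 8 kHz and a one-second signal (so that FFT bins fall on integer frequencies); the function names are illustrative. It only shows what happens to the spectrum; the claim that the perceived pitch stays at 100 Hz is a perceptual statement that code cannot verify.

```python
import numpy as np

FS = 8000            # sampling rate in Hz (assumed)
N = FS               # one second of signal, so FFT bins are 1 Hz apart

def pulse_train(heights, period_samples=80):
    """Pulses every 10 ms (80 samples at 8 kHz); heights cycle through `heights`."""
    x = np.zeros(N)
    positions = np.arange(0, N, period_samples)
    x[positions] = np.resize(heights, positions.size)
    return x

def lowest_component(x, rel_threshold=1e-6):
    """Lowest nonzero frequency (Hz) that holds energy in the spectrum."""
    mag = np.abs(np.fft.rfft(x))
    mag[0] = 0.0                                   # ignore DC
    return int(np.flatnonzero(mag > rel_threshold * mag.max())[0])

f0_regular = lowest_component(pulse_train([1.0]))
f0_alternating = lowest_component(pulse_train([1.0, 0.7]))
print(f0_regular, f0_alternating)
```

Lowering every other pulse to 0.7 introduces components at odd multiples of 50 Hz, so the lowest component moves from 100 Hz to 50 Hz.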

1.2.6 Inharmonic Signals

This section shows another example of a signal whose pitch does not correspond to its

fundamental frequency (i.e., the maximum common divisor of its frequency components).

Consider a signal built from the 13th, 19th, and 25th harmonics of 50 Hz (i.e., 650, 950, and

1250 Hz), as shown in Figure 1-7. Its fundamental frequency is 50 Hz, but its pitch is 334 Hz

(Patel and Balaban, 2001). This is interesting since the ratios between the components and the

pitch are far from being integer multiples: 1.95, 2.84, and 3.74. In any case, the pitch of the

signal no longer corresponds to its fundamental frequency. Although the true period of the signal

is $T_0$ = 20 ms, the signal peaks about every 3 ms, which corresponds to the pitch period of the


















Figure 1-7. Inharmonic signal. A) Signal (horizontal axis: time, ms). $T_0$ corresponds to the fundamental period of the signal
and $t_0$ corresponds to the pitch period. B) Spectrum (components at 650, 950, and 1250 Hz).

Object 1-7. Inharmonic signal (WAV file, 32 KB).



signal $t_0$ (see Panel A). Signals of this type, for which the components are not integer multiples

of the pitch are called inharmonic signals.

1.3 Loudness

Loudness is another perceptual quality of sound that provides us with information about its

source. It is important for pitch because the unification of the components of a sound into a

single entity, for which we identify a pitch, may be mediated by the relative loudness of the

components of the sound.

A conceptual definition of loudness is (Moore, 1997)

"...that attribute of auditory sensation in terms of which sounds can be ordered on a scale
extending from quiet to loud."









The most common unit to measure loudness is the sone. A sone is defined as the loudness

elicited by a 1 kHz tone presented at 40 dB SPL. The loudness L of a tone is

usually modeled as a power function of the sound pressure P of the tone, i.e.,

$$ L = k P^{\alpha}, \tag{1-6} $$

where k is a constant that depends on the units and $\alpha$ is the exponent of the power law.

In a review of loudness studies, Takeshima et al. (2003) found that the value of $\alpha$ is

usually reported to be within the range 0.4-0.6. They also reviewed more elaborate models with

many more parameters, but for simplicity, in this work we will use the simpler power model, and

for reasons we will explain later, we choose the value of $\alpha$ to be 0.5. In other words, we model

the loudness of a tone as being proportional to the square root of its amplitude.

1.4 Equivalent Rectangular Bandwidth

The bandwidth and the distribution of the filters used to extract the spectral components of

a sound are important issues that may affect our perception of pitch. Since each point of the

cochlea responds better to certain frequencies than others, the cochlea acts as a spectrum

analyzer. The bandwidth of the frequency response of each point of the cochlea is not constant

but varies with frequency, being almost proportional to the frequency of maximum response at

each point (Glasberg and Moore, 1990).

The concept of Equivalent Rectangular Bandwidth (ERB) was introduced as a description

of the spread of the frequency response of a filter. The ERB of a filter F is defined as the

bandwidth (in Hertz) of a rectangular filter R centered at the frequency of maximum response of

F, scaled to have the same output as F at that frequency, and passing the overall same amount of

white noise energy as F. In other words, when the power responses of F and R are plotted as a











Figure 1-8. Equivalent rectangular bandwidth. (Curves: auditory filter, ERB filter; horizontal axis: frequency, kHz.)



function of frequency, as in Figure 1-8, the central frequency of R corresponds to the mode of F,

and both curves have the same height and area.

Glasberg and Moore (1990) studied the response of auditory filters at different frequencies,

and proposed the following formula to approximate the ERB of the filters:

$$ \mathrm{ERB}(f) = 24.7 + 0.108 f \tag{1-7} $$

Another property of the cochlea is that the relation between frequency and site of

maximum excitation in the cochlea is not linear. If the distance between the apex of the cochlea

and the site of maximum excitation of a pure tone is plotted as a function of frequency of the

tone, it will be found that a displacement of 0.9 mm in the cochlea corresponds approximately to

one ERB (Moore, 1986). Therefore, it is possible to build a scale to measure the position of

maximum response in the cochlea for a certain frequency f by integrating the reciprocal of Equation 1-7 to obtain

the number of ERBs below f and then multiplying it by 0.9 mm to obtain the position. However,


























Figure 1-9. Equivalent-rectangular-bandwidth scale. (Horizontal axis: frequency, kHz.)



it is common practice in psychoacoustics to merely compute the number of ERBs below f, which

can be computed as

$$ \mathrm{ERBs}(f) = 21.4 \log_{10}(1 + f/229) \tag{1-8} $$

This scale is shown in Figure 1-9, and it will be the scale used by SWIPE to compute spectral

similarity.
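
The relation between Equations 1-7 and 1-8 can be checked numerically: if the ERB scale counts the number of ERBs below f, its slope at f should be close to 1/ERB(f). The sketch below assumes that reading; the function names are illustrative, and the match is only approximate because the constants in Equation 1-8 are rounded.

```python
import math

def erb(f):
    """Equation 1-7: equivalent rectangular bandwidth in Hz."""
    return 24.7 + 0.108 * f

def erbs(f):
    """Equation 1-8: number of ERBs below f."""
    return 21.4 * math.log10(1.0 + f / 229.0)

def erbs_slope(f, df=1e-3):
    """Central-difference derivative of the ERB scale (ERBs per Hz)."""
    return (erbs(f + df) - erbs(f - df)) / (2 * df)

for f in (100.0, 1000.0, 5000.0):
    # The last column should be close to 1 if the slope matches 1/ERB(f).
    print(f, erbs(f), erbs_slope(f) * erb(f))
```

One consequence of the logarithmic form is that equal steps on the ERB scale correspond to increasingly wide steps in Hz as frequency grows.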

1.5 Dissertation Organization

The rest of this dissertation is organized as follows. Chapter 2 presents previous pitch

estimation algorithms that are related to SWIPE, their problems and possible solutions to these

problems. Chapter 3 will discuss how these problems, plus some new ones and their solutions, lead to

SWIPE. Chapter 4 evaluates SWIPE using publicly available speech/music databases and a

disordered speech database. Publicly available implementations of other algorithms are also

evaluated on the same databases, and their performance is compared against SWIPE's.









1.6 Summary

Here we have presented the motivations and applications for pitch estimation. Then, we

presented conceptual and operational definitions of pitch, together with the related concept of

pitch strength and the duration threshold to perceive pitch. Next, we presented examples of

signals and their pitch, together with hypotheses about how pitch is determined. The sawtooth

waveform was highlighted, since it plays a key role in the development of SWIPE.

Psychoacoustic concepts such as inharmonic signals, loudness, and the ERB scale were also

introduced since they are also relevant for the development of SWIPE.









CHAPTER 2
PITCH ESTIMATION ALGORITHMS: PROBLEMS AND SOLUTIONS

This chapter presents some well known pitch estimation algorithms that appear in the

literature. These algorithms were chosen because of their influence upon the creation of SWIPE.

We will present the algorithms in a very basic form with the intent to capture their essence in a

simple expression, although their actual implementations may have extra details that we do not

present here. The purpose of those details is usually to fine tune the algorithms, but the actual

power of the algorithms is based on the essence we describe here.






Figure 2-1. General block diagram of pitch estimators. (Recoverable block labels: signal, windows, STFT, spectrum, IT, score, pitch.)









Traditionally, there have been two types of pitch estimation algorithms (PEAs): algorithms

based on the spectrum of the signal, and algorithms based on the time-domain representation of

the signal. The time-domain based algorithms presented in this chapter can also be formulated

based on the spectrum of the signal, which will be the approach followed here.

The basic steps that most PEAs perform to track the pitch of a signal are shown in the

block diagram of Figure 2-1. First, the signal is split into windows. Then, for each window the

following steps are performed: (i) the spectrum is estimated using a short-time Fourier transform

(STFT), (ii) a score is computed for each pitch candidate within a predefined range by computing

an integral transform (IT) over the spectrum, and (iii) the candidate with the highest score is

selected as the estimated pitch. The algorithms will be presented in an order that is convenient

for our purposes, but does not necessarily correspond to the chronological order in which they

were developed.
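
The three steps listed above can be sketched as a small framework in which the score is a pluggable function. This is an illustration under assumptions, not the implementation of any algorithm in this chapter: the window length, hop size, Hann window, and the placeholder score `harmonic_sum` are all choices made here for the example.

```python
import numpy as np

def track_pitch(x, fs, candidates, score, win_len=1024, hop=512):
    """Per window: magnitude spectrum -> score per candidate -> arg max."""
    window = np.hanning(win_len)
    freqs = np.fft.rfftfreq(win_len, d=1.0 / fs)
    pitches = []
    for start in range(0, len(x) - win_len + 1, hop):
        spectrum = np.abs(np.fft.rfft(window * x[start:start + win_len]))
        scores = [score(spectrum, freqs, f) for f in candidates]
        pitches.append(candidates[int(np.argmax(scores))])
    return pitches

def harmonic_sum(spectrum, freqs, f, n=5):
    """Placeholder score: spectrum interpolated at the first n harmonics of f."""
    return float(np.sum(np.interp(f * np.arange(1, n + 1), freqs, spectrum)))

fs = 8000
t = np.arange(fs) / fs
# Test tone: five harmonics of 200 Hz with amplitudes 1/k.
x = sum(np.sin(2 * np.pi * 200 * k * t) / k for k in range(1, 6))
pitches = track_pitch(x, fs, list(range(150, 301, 10)), harmonic_sum)
print(pitches)
```

The candidate range in the example deliberately excludes 100 Hz and 400 Hz; handling such subharmonic and harmonic candidates is the subject of the sections that follow.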

2.1 Harmonic Product Spectrum (HPS)

The first algorithm to be presented is Harmonic Product Spectrum (HPS) (Schroeder,

1968). This algorithm estimates the pitch as the frequency that maximizes the product of the

spectrum at harmonics of that frequency, i.e. as


$$ p = \arg\max_{f} \prod_{k=1}^{n} |X(kf)| \tag{2-1} $$

where X is the estimated spectrum of the signal, n is the number of harmonics to be used

(typically between 5 and 11), and p is the estimated pitch. The purpose of limiting the number of

harmonics to n is to reduce the computational cost, but there is no logical reason behind this

limit; it is hard to believe that the n-th harmonic is useful for pitch estimation, but not the (n+1)-th.

1 Since all the pitch estimators presented here use the magnitude of the spectrum but not its phase, the words
"magnitude of" will be omitted, and the word spectrum should be interpreted as magnitude of the spectrum unless
explicitly noted otherwise.











Figure 2-2. Harmonic product spectrum. (Curves: spectrum, kernel; horizontal axis: frequency, kHz.)

Object 2-1. Bandpass filtered /u/ (WAV file, 6 KB)



Since the logarithm is an increasing function, an equivalent approach is to estimate the

pitch as the frequency that maximizes the logarithm of the product of the spectrum at harmonics

of that frequency. Since the logarithm of a product is equal to the sum of the logarithms of the

terms, HPS can also be written as


$$ p = \arg\max_{f} \sum_{k=1}^{n} \log|X(kf)| \tag{2-2} $$

or using an integral transform, as


$$ p = \arg\max_{f} \int_0^{\infty} \log|X(f')| \sum_{k=1}^{n} \delta(f' - kf)\, df'. \tag{2-3} $$

Figure 2-2 shows the kernel of this integral for a pitch candidate with frequency 190 Hz.

A pitfall of this algorithm is that if any of the harmonics is missing (i.e., its energy is zero),

the product will be zero (equivalently, the sum of the logarithms will be minus infinity) for the

candidate corresponding to the pitch, and therefore the pitch will not be recognized. Figure 2-2










also shows the spectrum of the vowel /u/ (as in good) with a pitch of 190 Hz (Object 2-1). This

sample was passed through a filter with a bandpass range of 300-3400 Hz to simulate telephone-

quality speech. Therefore, the fundamental is missing and HPS is not able to recognize the pitch

of this signal. Another salient characteristic of this sample is its intense second harmonic at 380

Hz, caused probably by the first formant of the vowel, which is on average around 380 Hz as

well (Huang, Acero, and Hon, 2001).
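
The missing-harmonic pitfall can be illustrated on an idealized line spectrum. The sketch below assumes a sawtooth-like spectrum stored as a dictionary from frequency (Hz) to amplitude, with zero energy everywhere else, and a candidate grid of multiples of 10 Hz; these are simplifications made here, not part of HPS itself.

```python
import math

def line_spectrum(f0, n_harmonics, skip=()):
    """Idealized spectrum: amplitude 1/k at k*f0 Hz, zero elsewhere."""
    return {k * f0: 1.0 / k for k in range(1, n_harmonics + 1) if k not in skip}

def hps_score(spectrum, f, n=5):
    """Equation 2-1: product of the spectrum at the first n harmonics of f."""
    return math.prod(spectrum.get(k * f, 0.0) for k in range(1, n + 1))

def hps_pitch(spectrum, candidates, n=5):
    """Candidate with the largest harmonic product."""
    return max(candidates, key=lambda f: hps_score(spectrum, f, n))

candidates = range(50, 501, 10)
full = line_spectrum(100, 20)
no_fundamental = line_spectrum(100, 20, skip=(1,))
print(hps_pitch(full, candidates), hps_pitch(no_fundamental, candidates))
```

With the fundamental present the product peaks at 100 Hz; once it is removed the score of the 100 Hz candidate collapses to zero and a higher candidate wins instead.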

2.2 Sub-harmonic Summation (SHS)

An algorithm that has no problem with missing harmonics is Sub-Harmonic Summation

(SHS) (Hermes, 1988), which solves the problem by using addition instead of multiplication.

Therefore, if any harmonic is missing, it will not contribute to the total, but will not bring the

sum to zero either. In mathematical terms, SHS estimates the pitch as


$$ p = \arg\max_{f} \sum_{k=1}^{n} |X(kf)|, \tag{2-4} $$




Figure 2-3. Subharmonic summation. (Curves: spectrum, kernel; horizontal axis: frequency, kHz.)










































or using an integral transform as


$$ p = \arg\max_{f} \int_0^{\infty} |X(f')| \sum_{k=1}^{n} \delta(f' - kf)\, df'. \tag{2-5} $$

An example of the kernel of this integral is shown in Figure 2-3.

A pitfall of this algorithm is that since it gives the same weight to all the harmonics,

subharmonics of the pitch may have the same score as the pitch, and therefore they are valid

candidates for being recognized as the pitch. For example, suppose that a signal has a spectrum

consisting of only one component at f Hz. By definition, the pitch of the signal is f Hz as well.

However, since the algorithm adds the spectrum at n multiples of the candidate, each of the

subharmonics f/2, f/3, ..., f/n will have the same score as f, and therefore they are equally valid to

be recognized as the pitch.


Figure 2-4. Subharmonic summation with decay. (Curves: spectrum, kernel; horizontal axis: frequency, kHz.)









This problem can be solved by introducing a monotonically decaying weighting factor for

the harmonics. SHS implements this idea by weighting the harmonics with a geometric

progression as


$$ p = \arg\max_{f} \int_0^{\infty} |X(f')| \sum_{k=1}^{n} r^{k-1}\, \delta(f' - kf)\, df', \tag{2-6} $$

where the value of r was empirically set to 0.84 based on experiments using speech. The kernel

of this integral is shown in Figure 2-4. SHS is the only algorithm in this chapter that solves the

subharmonic problem by applying this decay factor. Later, another algorithm will be presented

(Biased Autocorrelation) which solves this problem in a different way.
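
The single-component example and the effect of the decay factor can be reproduced with a few lines. This sketch assumes a line spectrum stored as a dictionary and evaluates only a handful of hand-picked candidates; `shs_score` is an illustrative name.

```python
def shs_score(spectrum, f, n=5, r=1.0):
    """Equation 2-6 on a line spectrum: sum of r**(k-1) * |X(k f)| for k = 1..n."""
    return sum(r ** (k - 1) * spectrum.get(k * f, 0.0) for k in range(1, n + 1))

spectrum = {400: 1.0}                 # single component at 400 Hz
candidates = [80, 100, 200, 400]      # 400/5, 400/4, 400/2, and 400 itself

flat = {f: shs_score(spectrum, f) for f in candidates}             # r = 1, Equation 2-4
decayed = {f: shs_score(spectrum, f, r=0.84) for f in candidates}  # Equation 2-6
best = max(decayed, key=decayed.get)
print(flat, decayed, best)
```

Without decay all four candidates tie; with r = 0.84 the component is weighted by r**(k-1), so the candidate that sees it as its first harmonic scores highest.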

2.3 Subharmonic to Harmonic Ratio (SHR)

A drawback of the algorithms presented so far is that they examine the spectrum only at

the harmonics of the fundamental, ignoring the contents of the spectrum everywhere else. An

example will illustrate why this is a problem. Suppose that the input signal is white noise (i.e., a

signal with a flat spectrum). This signal is perceived as having no pitch. However, the previous

algorithms will produce the same score for each pitch candidate, making each of them a valid

estimate for the pitch.

This problem is solved by the Subharmonic to Harmonic Ratio algorithm (SHR) (Sun,

2000), which not only adds the spectrum at harmonics of the pitch candidate, but also subtracts

the spectrum at the middle points between harmonics. However, this algorithm uses the

logarithm of the spectrum, and therefore has the problem previously discussed for HPS. Also,

this algorithm gives the same weight to all the harmonics and therefore it suffers from the

subharmonics problem. SHR can be written as


$$ p = \arg\max_{f} \int_0^{\infty} \log|X(f')| \sum_{k=1}^{n} \left[\delta(f' - kf) - \delta(f' - (k - 1/2)f)\right] df'. \tag{2-7} $$











Figure 2-5. Subharmonic to harmonic ratio. (Curves: spectrum, kernel; horizontal axis: frequency, kHz.)



The kernel of the integral is shown in Figure 2-5. Notice that SHR will produce a positive score

for a signal with a harmonic spectrum and a score of zero for white noise. However, this

algorithm has a problem that is shared by all the algorithms presented so far: since they examine

the spectrum only at harmonic locations, they cannot recognize the pitch of inharmonic signals.
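
The behavior of SHR on white noise and on a harmonic spectrum can be sketched as follows. The spectra are idealized functions of frequency; the flat level of 2.0 and the floor of 0.01 between peaks (introduced so that the logarithm stays finite) are assumptions made for the example.

```python
import math

def shr_score(X, f, n=5):
    """Equation 2-7: log|X| at harmonics minus log|X| at the midpoints below them."""
    return sum(math.log(X(k * f)) - math.log(X((k - 0.5) * f))
               for k in range(1, n + 1))

def log_sum_score(X, f, n=5):
    """Equation 2-2 for comparison: harmonics only."""
    return sum(math.log(X(k * f)) for k in range(1, n + 1))

def white_noise(freq):
    return 2.0                        # flat magnitude spectrum (arbitrary level)

def harmonic_200(freq):
    """Peaks of height 1 at multiples of 200 Hz over an assumed floor of 0.01."""
    return 1.0 if freq > 0 and freq % 200 == 0 else 0.01

candidates = [100, 150, 200, 250]
noise_hps = [log_sum_score(white_noise, f) for f in candidates]
noise_shr = [shr_score(white_noise, f) for f in candidates]
tone_shr = {f: shr_score(harmonic_200, f) for f in candidates}
print(noise_hps, noise_shr, tone_shr)
```

For the flat spectrum the harmonics-only score is identical for every candidate, whereas the SHR score is zero for all of them; for the harmonic spectrum the SHR score is positive and largest at 200 Hz among these candidates.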

Before we move on to the next algorithm, we wish to add some insight to SHR. If we

further divide the sum in Equation 2-7 by n, the algorithm would compute the average peak-to-

valley ratio, where the peaks are expected to be at the harmonics of the candidate, and the valleys

are expected to be at the middle point between harmonics. This idea will be exploited later by

SWIPE, albeit with some refinements: the average will be weighted, the ratio will be replaced

with the distance, and the peaks and valleys will be examined over wider and blurred regions.

2.4 Harmonic Sieve (HS)

One algorithm that is able to recognize the pitch of some inharmonic signals is the

Harmonic Sieve (HS) (Duifhuis and Willems, 1982). This algorithm is similar to SHS, but has










two key differences: instead of using pulses it uses rectangles, and instead of computing the inner

product between the spectrum and the rectangles, it counts the number of rectangles that contain

at least one component (a rectangle is said to contain a component if the component fits within

the rectangle and its amplitude exceeds a certain threshold T). The rectangles are centered at the

harmonics of the pitch candidates, and their width is 8% of the frequency of the harmonics. This

algorithm can be expressed mathematically as


$$ p = \arg\max_{f} \sum_{k=1}^{n} \left[\, T < \max_{f' \in (0.96kf,\ 1.04kf)} |X(f')| \,\right] \tag{2-8} $$

where [·] is the Iverson bracket (i.e., produces a value of one if the bracketed proposition is true,

and zero otherwise). Notice that the expression in the sum is a non-linear function of the

spectrum, and therefore this algorithm cannot be written using an integral transform. Figure 2-6

shows the kernel used by this algorithm.
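
A minimal sketch of the counting rule in Equation 2-8 follows. It assumes the spectrum is given as a dictionary of discrete components (frequency in Hz mapped to amplitude); the component values, the threshold of 0.1, and n = 3 are illustrative choices, not values taken from Duifhuis and Willems.

```python
def sieve_score(components, f, n=3, threshold=0.1):
    """Equation 2-8: count harmonics k*f whose +/-4% rectangle holds a
    component with amplitude above `threshold`."""
    score = 0
    for k in range(1, n + 1):
        lo, hi = 0.96 * k * f, 1.04 * k * f
        if any(lo < freq < hi and amp > threshold
               for freq, amp in components.items()):
            score += 1
    return score

# Slightly mistuned partials of a 200 Hz tone (illustrative values).
components = {205: 1.0, 398: 0.8, 610: 0.6}
scores = {f: sieve_score(components, f) for f in (100, 200, 300)}
print(scores)
```

All three mistuned partials fall inside the rectangles of the 200 Hz candidate, so it collects the full count even though none of them is an exact harmonic. Nearby candidates can reach the same count, which is one reason the hard rectangle edges discussed next matter.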





Spectrulm
Kernel



1









0 0.5 1 1.5
Frequency (kHz)

Figure 2-6. Harmonic sieve.










A pitfall of HS is that, when a component is close to an edge of a rectangle, a slight change

in its frequency could put it in or out of the rectangle, possibly changing the estimated pitch

drastically. Such radical changes do not typically occur in pitch perception, where small changes

in the frequency of the components lead to small changes in the perceived pitch, as mentioned in

Section 1.2.6. This problem can be solved by using smoother boundaries to decide whether a

component should be considered as a harmonic or not, as done by the next algorithm.

2.5 Autocorrelation (AC)

One of the most popular methods for pitch estimation is autocorrelation. The

autocorrelation function r(t) of a signal x(t) measures the correlation of the signal with itself after

a lag of size t, i.e.,


$$ r(t) = \int_{-T/2}^{T/2} x(t')\, x(t'+t)\, dt'. \tag{2-9} $$


The Wiener-Khinchin theorem shows that autocorrelation can also be computed as the inverse

Fourier cosine transform of the squared spectrum of the signal, i.e., as


$$ r(t) = \int_0^{\infty} |X(f)|^2 \cos(2\pi f t)\, df. \tag{2-10} $$
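
The equivalence between the time-domain and spectral definitions can be verified numerically in the discrete case. The sketch below assumes a finite real signal and uses the DFT, for which the theorem holds for the circular autocorrelation; the random test signal is arbitrary.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.standard_normal(256)                       # any real test signal

# Direct (circular) autocorrelation: r[m] = sum_n x[n] * x[(n + m) mod N].
r_direct = np.array([np.dot(x, np.roll(x, -m)) for m in range(x.size)])

# Inverse transform of the power spectrum.
r_spectral = np.fft.irfft(np.abs(np.fft.rfft(x)) ** 2, n=x.size)

max_error = float(np.max(np.abs(r_direct - r_spectral)))
print(max_error)
```

The two results agree up to floating-point rounding, which is what allows the time-domain algorithm to be analyzed through its frequency-domain kernel.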


The autocorrelation-based pitch estimation algorithm (AC) estimates the pitch as the frequency

whose inverse maximizes the autocorrelation function of the signal, i.e., as


$$ p = \arg\max_{f < f_{\max}} \int_0^{\infty} |X(f')|^2 \cos(2\pi f'/f)\, df', \tag{2-11} $$

where the parameter $f_{\max}$ is introduced to avoid the maximum that the integral has at infinity. The

kernel for this integral is shown in Figure 2-7. It is easy to see that as f increases, the kernel

stretches without limit, and since the cosine starts with a value of one and decays smoothly,

eventually it will give a weight of one to the whole spectrum, producing a maximum at infinity.











Figure 2-7. Autocorrelation. (Curves: spectrum, kernel; horizontal axis: frequency, kHz.)



Notice that this problem can be easily solved by removing the first quarter of the first cycle of

the cosine (i.e., setting it to zero). Since the DC of a signal (i.e., its zero-frequency component)

only adds a constant to the signal, ignoring the DC should not affect the pitch estimation of a

periodic signal.

Because of the frequency domain representation of autocorrelation, we can see that there is

a large resemblance between AC and SHR (compare the kernel of Figure 2-7 with the kernel of

Figure 2-5), although with three main differences. First, instead of using an alternating sequence

of pulses, AC uses a cosine, which adds a smooth interpolation between the pulses. Second, AC

adds an extra lobe at DC, which was already shown to have a negative effect. Third, AC uses the

power of the spectrum (i.e., the squared spectrum) instead of the logarithm of the spectrum.

Therefore, both algorithms measure the average peak-to-valley distance, one in the power

domain and the other in the logarithmic domain, although AC does it in a much smoother way.









There is also a similarity between AC and HS (compare the kernel of Figure 2-7 with the

kernel of Figure 2-6). HS allows for inharmonicity of the components of the signal by

considering as harmonic any component within a certain distance from a harmonic of the

candidate pitch. AC does the same in a smoother way by assigning to a component a weight that

is a function of its distance to the closest harmonic of the candidate pitch; the smaller the distance,

the larger the weight, and the larger the distance, the smaller the weight. In fact, if the

component is too far from any harmonic, its weight can be negative.

Like all the algorithms presented so far, except SHS, AC exhibits the subharmonics problem

caused by the equal weight given to all the harmonics (see Section 2.2). To solve this problem, it

is common to take the local maximum of highest frequency rather than the global maximum.

However, this technique sometimes fails. For example, consider a signal with fundamental

frequency 200 Hz (i.e., period of 5 ms) and first four harmonics with amplitudes 1, 6, 1, 1, as

shown in Figure 2-8A (Object 2-2). Except at very low intensity levels, the four components are

audible, and the pitch of the signal corresponds to its fundamental frequency. However, as shown

in Figure 2-8C, AC has its first non-zero local maximum at 2.5 ms, which corresponds to a pitch

of 400 Hz.
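
The failure of the first-peak rule on this signal can be reproduced from Equation 2-10. The sketch assumes that, for a sum of sinusoids, the autocorrelation is (up to a constant factor) the sum of cosines at the harmonic frequencies weighted by the squared amplitudes; the lag grid of 0.01 ms is an arbitrary choice.

```python
import numpy as np

F0 = 200.0
AMPLITUDES = [1.0, 6.0, 1.0, 1.0]                  # harmonics 1..4

lags = np.arange(0, 1001) * 1e-5                   # 0 to 10 ms in 0.01 ms steps
# Each harmonic contributes its power times a cosine at its own frequency.
r = sum(a ** 2 * np.cos(2 * np.pi * k * F0 * lags)
        for k, a in enumerate(AMPLITUDES, start=1))

# Interior local maxima (lag zero and the last sample are excluded).
interior = np.arange(1, lags.size - 1)
is_peak = (r[interior] > r[interior - 1]) & (r[interior] > r[interior + 1])
peak_lags_ms = lags[interior[is_peak]] * 1e3
first_peak_ms = float(peak_lags_ms[0])
print(first_peak_ms, 1000.0 / first_peak_ms)
```

The first non-zero local maximum falls at about 2.5 ms, which maps to a pitch estimate near 400 Hz rather than the 200 Hz fundamental.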

Another common solution is to use the biased autocorrelation (BAC) (Sondhi, 1968;

Rabiner, 1977), which introduces a factor that penalizes the selection of low pitch values. This

factor gives a weight of one to a pitch period of zero and decays linearly to zero for a pitch

period corresponding to the window size T. This can be written as


$$ p = \arg\max_{f < f_{\max}} \left(1 - \frac{1}{Tf}\right) \int_0^{\infty} |X(f')|^2 \cos(2\pi f'/f)\, df'. \tag{2-12} $$

































Figure 2-8. Comparison between AC, BAC, ASDF, and AMDF. A) Spectrum of a signal with
pitch and fundamental frequency of 200 Hz. B) Waveform of the signal with a
fundamental period of 5 ms. C) AC has a maximum at every multiple of 5 ms,
making it hard to choose the best candidate. The first (non-zero) local maximum is at
2.5 ms, making the "first peak" criterion fail. D) BAC has its first peak and its
largest non-zero local maximum at 2.5 ms. E) ASDF is an inverted, shifted, and scaled
AC. F) AMDF is similar to ASDF. (Axes recovered from the figure: frequency in Hz, time in ms.)

Object 2-2. Signal with strong second harmonic (WAV file, 32 KB)



However, the combination of this bias and the squaring of the spectrum may introduce new

problems. For example, if T= 20 ms as in the BAC function of Figure 2-8D, the bias will make

the height of the peak at 2.5 ms larger than the height of the peak at 5 ms, consequently causing

an incorrect pitch estimate.
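
The comparison of the two peak heights can be worked out by hand. The following is a sketch that assumes the autocorrelation of the example signal is proportional to a sum of cosines weighted by the squared amplitudes 1, 36, 1, 1, and that the bias is the linear factor 1 - t/T described above with T = 20 ms.

```latex
\begin{align*}
r(t) &\propto \cos(2\pi\, 200\, t) + 36\cos(2\pi\, 400\, t)
        + \cos(2\pi\, 600\, t) + \cos(2\pi\, 800\, t),\\
r(2.5\ \text{ms}) &\propto -1 + 36 - 1 + 1 = 35,
\qquad
r(5\ \text{ms}) \propto 1 + 36 + 1 + 1 = 39,\\
\left(1 - \tfrac{2.5}{20}\right) \cdot 35 &= 30.625
\;>\;
\left(1 - \tfrac{5}{20}\right) \cdot 39 = 29.25 .
\end{align*}
```

Under these assumptions the unbiased peak at 5 ms is the larger one, but after applying the bias the peak at 2.5 ms becomes larger.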












2.6 Average Magnitude and Squared Difference Functions (AMDF, ASDF)

Two functions similar to autocorrelation (in the sense that they compare the signal with

itself after a lag of size t) are the average magnitude difference function (AMDF) and the average

squared difference function (ASDF). The AMDF is defined as

$$ d(t) = \int_{-T/2}^{T/2} |x(t') - x(t'+t)|\, dt', \tag{2-13} $$

and the ASDF as

$$ s(t) = \int_{-T/2}^{T/2} [x(t') - x(t'+t)]^2\, dt'. \tag{2-14} $$

It is easy to show that ASDF and autocorrelation are related through the equation (Ross, 1974)

$$ s(t) = 2\,(r(0) - r(t)), \tag{2-15} $$

and therefore s(t) is just an inverted, shifted, and scaled version of autocorrelation. Consequently, as

illustrated in the panels C (or D) and E of Figure 2-8, where (biased) autocorrelation has peaks,

s(t) has dips. Thus, an ASDF-based algorithm must look for minima instead of maxima to

estimate pitch.
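
Equation 2-15 can be checked numerically. The sketch below assumes a periodic signal, represented by one period with circular shifts, for which the identity holds exactly; the random test signal and the helper `shifted` are illustrative.

```python
import numpy as np

rng = np.random.default_rng(1)
x = rng.standard_normal(200)                       # one period of a periodic signal

def shifted(x, m):
    """Circular shift: returns x[(n + m) mod N]."""
    return np.roll(x, -m)

lags = range(x.size)
r = np.array([np.sum(x * shifted(x, m)) for m in lags])            # autocorrelation
s = np.array([np.sum((x - shifted(x, m)) ** 2) for m in lags])     # ASDF
d = np.array([np.sum(np.abs(x - shifted(x, m))) for m in lags])    # AMDF

identity_error = float(np.max(np.abs(s - 2 * (r[0] - r))))
print(identity_error, int(np.argmax(r)), int(np.argmin(s)), int(np.argmin(d)))
```

The maximum of r and the minima of s and d coincide at lag zero here, which illustrates why an ASDF-based or AMDF-based estimator searches for dips where an AC-based estimator searches for peaks.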

It has also been shown (Ross, 1974) that d(t) can be approximated as


$$ d(t) \approx \beta(t)\, [s(t)]^{1/2}. \tag{2-16} $$

Although the relation between d(t) and s(t) depends on t through $\beta(t)$, it is found in practice that

this factor does not play a significant role, and a large similarity between d(t) and s(t) exists, as

observed in panels E and F of Figure 2-8. Therefore, since the functions r(t), s(t), and d(t) are so

strongly related, none of them is expected to offer much more than the others for pitch

estimation. However, modifications to these functions, which cannot be expressed in terms of the

other functions, have been used successfully to improve their performance on pitch estimation.










An example is given by YIN (de Cheveigne, 2002), which uses a variation of s(t) to avoid the dip

at lag zero, improving its performance. Another variation is the one we proposed in the previous

section (i.e., the removal of the first quarter of the cosine) to avoid the maximum at zero lag for

autocorrelation.

2.7 Cepstrum (CEP)

An algorithm similar to AC is the cepstrum-based pitch estimation algorithm (CEP) (Noll,

1967). The cepstrum c(t) of a signal x(t) is very similar to its autocorrelation. The only difference

is that it uses the logarithm of the spectrum instead of its square, i.e.,



$$ c(t) = \int_0^{\infty} \log|X(f)| \cos(2\pi f t)\, df. \tag{2-17} $$


CEP estimates the pitch as the frequency whose inverse maximizes the cepstrum of the signal,

i.e., as





Figure 2-9. Cepstrum. (Curves: spectrum, kernel; horizontal axis: frequency, kHz.)











$$ p = \arg\max_{f} \int_0^{\infty} \log|X(f')| \cos(2\pi f'/f)\, df'. \tag{2-18} $$

The kernel of this integral is shown in Figure 2-9. Like AC, CEP exhibits the subharmonics

problem and the problem of having a maximum at a large value of f. The maximum is not

necessarily at infinity because, depending on the scaling of the signal, the logarithm of the

spectrum may be negative at large frequencies, and therefore assigning a positive weight to that

region may in fact decrease the score. Figure 2-10 shows the spectrum of the speech signal that

has been used in previous figures and the kernel that produces the highest score for that

spectrum, which corresponds to a candidate pitch of about 10 kHz. The logarithm of

the spectrum was arbitrarily set to zero for frequencies below 300 Hz because its original value

(minus infinity) would make the evaluation of the integral in Equation 2-18 infeasible. This

problem of the use of the logarithm when there are missing harmonics was already discussed in

Section 2.1.





Figure 2-10. Problem caused to cepstrum by cosine lobe at DC. (Curves: spectrum, kernel; horizontal axis: frequency, kHz.)









2.8 Summary

In this chapter we presented pitch estimation algorithms that have influenced the creation

of SWIPE. The most common problems found in these algorithms were the inability to deal with

missing harmonics (HPS, SHR, and CEP) and inharmonic signals (HPS, SHS, and SHR), and the

tendency to produce high scores for subharmonics of the pitch (all the algorithms, although to a

lesser extent SHS and BAC). Solutions to these problems were either found in other algorithms

or were proposed by us.









CHAPTER 3
THE SAWTOOTH WAVEFORM INSPIRED PITCH ESTIMATOR

Aiming to improve upon the algorithms presented in Chapter 2, we propose the Sawtooth

Waveform Inspired Pitch Estimator (SWIPE)2. The seed of SWIPE is the implicit idea of the

algorithms presented in Chapter 2: to find the frequency that maximizes the average peak-to-

valley distance at harmonics of that frequency. However, this idea will be implemented trying to

avoid the problem-causing features found in those algorithms. This will be achieved by avoiding

the use of the logarithm of the spectrum, applying a monotonically decaying weight to the

harmonics, observing the spectrum in the neighborhood of the harmonics and middle points

between harmonics, and using smooth weighting functions.

3.1 Initial Approach: Average Peak-to-Valley Distance Measurement

If a signal is periodic with fundamental frequency f, its spectrum must contain peaks at
multiples of f and valleys in between. Since each peak is surrounded by two valleys, the average
peak-to-valley distance (APVD) for the k-th peak is defined as

\[
d_k(f) = \frac{1}{2}\Big[\,|X(kf)| - |X((k-1/2)f)|\,\Big] + \frac{1}{2}\Big[\,|X(kf)| - |X((k+1/2)f)|\,\Big]
       = |X(kf)| - \frac{1}{2}\Big[\,|X((k-1/2)f)| + |X((k+1/2)f)|\,\Big]. \tag{3-1}
\]

Averaging over the first n peaks, the global APVD is

\[
D_n(f) = \frac{1}{n}\sum_{k=1}^{n} d_k(f)
       = \frac{1}{n}\left[\frac{1}{2}|X(f/2)| - \frac{1}{2}|X((n+1/2)f)| + \sum_{k=1}^{n}\Big(|X(kf)| - |X((k-1/2)f)|\Big)\right]. \tag{3-2}
\]
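The two forms of the global APVD can be checked numerically. The following is a minimal sketch, not part of the original algorithm description: the function names and the smooth toy spectrum (unit peaks at multiples of 190 Hz, zeros midway between them) are assumptions chosen only for illustration.

```python
import math


def apvd_peak(X, f, k):
    """Average peak-to-valley distance d_k(f) of the k-th peak (Equation 3-1)."""
    left = abs(X(k * f)) - abs(X((k - 0.5) * f))
    right = abs(X(k * f)) - abs(X((k + 0.5) * f))
    return 0.5 * left + 0.5 * right


def apvd_global(X, f, n):
    """Global APVD D_n(f): the mean of d_k(f) over the first n peaks."""
    return sum(apvd_peak(X, f, k) for k in range(1, n + 1)) / n


def apvd_global_rearranged(X, f, n):
    """The same quantity written in the rearranged form of Equation 3-2."""
    edge = 0.5 * abs(X(0.5 * f)) - 0.5 * abs(X((n + 0.5) * f))
    body = sum(abs(X(k * f)) - abs(X((k - 0.5) * f)) for k in range(1, n + 1))
    return (edge + body) / n


def toy_spectrum(freq, f0=190.0):
    """Hypothetical magnitude spectrum: 1 at multiples of f0, 0 halfway between."""
    return math.cos(math.pi * freq / f0) ** 2


score_at_pitch = apvd_global(toy_spectrum, 190.0, 4)
score_at_subharmonic = apvd_global(toy_spectrum, 95.0, 4)
```

For this toy spectrum the candidate at 190 Hz receives a higher score than the candidate at 95 Hz, and both functions return the same value for any candidate.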





2 The name of the algorithm will become clear in a later section.












Figure 3-1. Average-peak-to-valley-distance kernel. [Plot of spectrum and kernel versus frequency (kHz).]




Our first approach to estimate pitch is to find the frequency that maximizes the APVD. Staying
with the integral transform notation used in Chapter 2, and dropping the unnecessary 1/n term,
the algorithm can be expressed as

\[
p = \arg\max_f \int_0^{\infty} |X(f')|\, K_n(f, f')\, df', \tag{3-3}
\]

where

\[
K_n(f, f') = \frac{1}{2}\,\delta(f' - f/2) - \frac{1}{2}\,\delta\big(f' - (n+1/2)f\big) + \sum_{k=1}^{n}\Big[\delta(f' - kf) - \delta\big(f' - (k-1/2)f\big)\Big]. \tag{3-4}
\]

The kernel $K_n(f, f')$ for f = 190 Hz is shown in Figure 3-1 together with the spectrum of the
sample vowel /u/ used in Chapter 2, which will be used extensively in this chapter as well. The
kernel is a function not only of the frequencies but also of n, the number of harmonics to be used.
Each positive pulse in the kernel has a weight of 1, each negative pulse between positive pulses
has a weight of -1, and the first and last negative pulses have a weight of -1/2. This kernel is
similar to the kernel used by SHR (see Chapter 2), with the only difference that in $K_n$ the first
negative pulse has a weight of -1/2 and $K_n$ has an extra negative pulse at the end, also with a
weight of -1/2.

3.2 Blurring of the Harmonics

The previous method of measuring the APVD works if the signal is harmonic, but not if it

is inharmonic. To allow for inharmonicity, our first approach was to blur the location of the

harmonics by replacing each pulse with a triangle function with base f/2,

\[
\Lambda_f(f') = \begin{cases} f/4 - |f'|, & \text{if } |f'| < f/4, \\ 0, & \text{otherwise.} \end{cases} \tag{3-5}
\]

The base of the triangle was set to f/2 to produce a triangular wave as shown in Figure 3-2. To be

consistent with the APVD measure, the first and last negative triangles were given a height of

1/2. One reason for using a base that is proportional to the candidate pitch is that it allows for a

pitch-independent handling of inharmonicity, as seems to be done in the auditory system (see

section 1.2.6).



Figure 3-2. Triangular wave kernel. [Plot of spectrum and kernel versus frequency (kHz).]

















Figure 3-3. Necessity of strictly convex kernels. [Plot versus frequency (Hz), from 0 to 600 Hz.]

Object 3-1. Beating tones (WAV file, 32 KB)



The triangular kernel approach was abandoned because it was found that the components

of the kernel must be strictly concave (i.e., must have a continuous second derivative) at their

maxima. The following example will illustrate why this is necessary. Suppose we have a signal

with components at 200 and 220 Hz, as shown in Figure 3-3 (Object 3-1). This signal is

perceived as a loudness-varying tone with a pitch of 210 Hz, a phenomenon known as beating.
However, the triangular kernel produces the same score for each candidate between 200 and 220
Hz. This is easy to see by slightly stretching or compressing the kernel such that its first positive
peak is within that range. Such stretching or compression would cause an increase in the
weight of one of the components and a decrease of the same amount in the weight of the other,
keeping the score constant.
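The beating-tone argument can be illustrated with a simplified sketch. As an assumption made only for this example, a single lobe of fixed width is shifted across the 200 to 220 Hz range instead of stretching the whole kernel; under that simplification the triangular lobe gives exactly the same score everywhere in the range, while a cosine lobe of the same width has a single maximum at the midpoint. The half-width value and all names are hypothetical.

```python
import math

HALF_WIDTH = 52.5  # Hz; hypothetical fixed half-width (one quarter of 210 Hz)


def triangle_lobe(x):
    """Triangular lobe of unit height, zero outside |x| < HALF_WIDTH."""
    return max(0.0, 1.0 - abs(x) / HALF_WIDTH)


def cosine_lobe(x):
    """Positive cosine lobe of unit height with the same support."""
    if abs(x) >= HALF_WIDTH:
        return 0.0
    return math.cos(0.5 * math.pi * x / HALF_WIDTH)


def score(lobe, center, components=(200.0, 220.0)):
    """Sum of the lobe weights seen by two equal-amplitude components."""
    return sum(lobe(c - center) for c in components)


centers = [200.0 + 0.5 * i for i in range(41)]  # 200, 200.5, ..., 220 Hz
triangle_scores = [score(triangle_lobe, c) for c in centers]
cosine_scores = [score(cosine_lobe, c) for c in centers]
best_cosine_center = centers[cosine_scores.index(max(cosine_scores))]
```

The flat triangular score leaves the choice among candidates undetermined, whereas the strictly concave cosine lobe selects 210 Hz.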

Therefore, the triangle was discarded and concatenations of truncated squarings,
Gaussians, and cosines were explored. The squaring function was truncated at its fixed point, and
the Gaussian and the cosine functions were truncated at their inflection points. The Gaussian was
truncated at the inflection points to ensure that the concatenation of positive and negative
Gaussians has a continuous second derivative. The same can be said about the cosine, but
furthermore, the concatenation of positive and negative cosine lobes produces a cosine, which
has derivatives of all orders.

Figure 3-4. Kernels formed from concatenations of truncated squarings, Gaussians, and cosines. [Plot versus frequency (Hz), with ticks at 0, f, 2f, and 3f.]

Concatenations of these three functions, stretched or compressed to form the desired

pattern of maxima at multiples of the candidate pitch, are illustrated in Figure 3-4. Although

informal tests showed no significant differences in pitch estimation performance among the

three, the cosine was preferred because of its simplicity. Notice also that this kernel is the one

used by the AC and CEP pitch estimators (see Chapter 2).

3.3 Warping of the Spectrum

As mentioned in Chapter 2, the use of the logarithm of the spectrum in an integral
transform is inconvenient because there may be regions of the spectrum with no energy, which
would prevent the evaluation of the integral, since the logarithm of zero is minus infinity. But
even if there is some small energy in those regions, the large absolute value of the logarithm
could make the effect of these low energy regions on the integral larger than the effect of the
regions with the most energy, which is certainly undesirable.

Figure 3-5. Warping of the spectrum. [Multi-panel plot versus frequency (kHz).]

To avoid this situation, the use of the logarithm of the spectrum was discarded and other

commonly used functions were explored: square, identity, and square-root. Figure 3-5 shows

how these functions warp the spectrum of the vowel /u/ used in Chapter 2. As mentioned earlier,
this spectrum has two peculiarities: it has a missing fundamental, and it has a salient second
harmonic. The missing fundamental is evident in panel B, which shows that the logarithm of the
spectrum in the region of 190 Hz is minus infinity. The salient second harmonic at 380 Hz shows
up clearly in the other three panels, but especially in panel C, where the spectrum has been
squared. Panel D shows the square-root of the spectrum, which overemphasizes neither the
missing fundamental (as the logarithm does) nor the salient second harmonic (as the square
does).
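The effect of each warping can be seen on a small set of made-up magnitudes. This is only an illustrative sketch: the three values stand for a missing fundamental, a salient second harmonic, and a weaker third harmonic, and are not taken from the vowel analyzed in the text.

```python
import math


def warp(amplitudes, kind):
    """Apply one of the warpings discussed in the text to spectral magnitudes."""
    if kind == "log":
        # The logarithm of zero is minus infinity; math.log(0) raises an
        # error, so that case is mapped by hand.
        return [math.log(a) if a > 0 else float("-inf") for a in amplitudes]
    if kind == "square":
        return [a * a for a in amplitudes]
    if kind == "identity":
        return list(amplitudes)
    if kind == "sqrt":
        return [math.sqrt(a) for a in amplitudes]
    raise ValueError(f"unknown warping: {kind}")


# Hypothetical magnitudes at 190, 380, and 570 Hz.
toy = [0.0, 1.0, 0.25]
warped = {kind: warp(toy, kind) for kind in ("log", "square", "identity", "sqrt")}
```

With these values the logarithm sends the missing fundamental to minus infinity, the square shrinks the third harmonic relative to the second, and the square root keeps the third harmonic closer to the second.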

We believe the square-root warping of the spectrum is more convenient for three reasons.
First, it better matches the response of the auditory system to amplitude, which is close to a
power function with an exponent in the range 0.4-0.6 (see Chapter 2); second, it allows for a
weighting of the harmonics proportional to their amplitude, as we will show in the next section;
and third, it produces better pitch estimates, as found in tests presented later.

3.4 Weighting of the Harmonics

To avoid the subharmonics problem presented in Chapter 2, a decaying weighting factor was
applied to the harmonics. The types of decays explored were exponential and harmonic. For
exponential decays, a weight of $r^{k-1}$ was applied to the k-th harmonic (k = 1, 2, ..., n, and
r = 0.9, 0.7, 0.5) through the multiplication of the kernel by the envelope $r^{f'/f - 1}$, as shown in
Figure 3-6. For harmonic decays, a weight of $1/k^p$ was applied to the k-th harmonic
(k = 1, 2, ..., n, and p = 1/2, 1, 2) through the multiplication of the kernel by the envelope
$(f/f')^p$, as shown in Figure 3-6. In informal tests, the best results were obtained using harmonic
decays with p = 1/2, which matches the decay of the square-root of the average spectrum of
vowels (see Chapter 2). In other words, better pitch estimates were obtained when computing the
inner product (IP) of the square-root of the input spectrum and the square-root of the expected
spectrum, than when computing the IPs over the raw spectra.












Figure 3-6. Weighting of the harmonics. [Plot of the exponential and harmonic decay envelopes versus frequency.]




One explanation for this is that when the input spectrum matches its corresponding
template (i.e., the expected spectrum for that pitch), the use of the square-root of the spectra in
the IP gives to each harmonic a weight proportional to its amplitude. For example, if the input
spectrum has the expected shape for a vowel, i.e., the amplitudes of the harmonics decay as 1, 1/2,
1/3, etc., then their square roots decay as 1, 1/√2, 1/√3, etc. Since the terms in the sum of the IP
are the squares of these values (i.e., 1, 1/2, 1/3, etc.), the relative contribution of each
harmonic is proportional to its amplitude. Conversely, if we compute the IP over the raw spectra,
the terms of the sum will be 1, 1/4, 1/9, etc., which are not proportional to the amplitude of the
components, but to their square. This would make the contribution of the strongest harmonics too
large and the contribution of the weakest too small. The situation would be even worse if we
computed the IP over the energy of the spectrum (i.e., its square). The expected energy of
the harmonics for a vowel follows the pattern 1, 1/4, 1/9, etc., and computing the IP of the
energy of the harmonics with itself produces the terms 1, 1/16, 1/81, etc., which gives too much
weight to the first harmonic and almost no weight to the other harmonics.
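The imbalance described above can be quantified with a short sketch. It assumes harmonic amplitudes that decay as 1/k, so that the k-th term of the inner product of a warped spectrum with itself is 1/k for the square root, 1/k^2 for the raw spectrum, and 1/k^4 for the energy. The choice of ten harmonics and the function name are arbitrary.

```python
def first_harmonic_share(exponent, n=10):
    """Fraction of the inner product due to harmonic 1 when term k is 1/k**exponent."""
    terms = [1.0 / k ** exponent for k in range(1, n + 1)]
    return terms[0] / sum(terms)


share_sqrt = first_harmonic_share(1)    # square-root spectra: terms 1/k
share_raw = first_harmonic_share(2)     # raw spectra: terms 1/k**2
share_energy = first_harmonic_share(4)  # energy spectra: terms 1/k**4
```

Under these assumptions the first harmonic accounts for roughly a third of the total with square-root warping, about two thirds with the raw spectra, and more than nine tenths with the energy, which leaves the remaining harmonics with almost no influence.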

In the ideal case in which there is a perfect match between the input and the template, any

of the previous types of warping would produce the same result: a normalized inner product

(NIP) equal to 1. However, the likelihood of a perfect match is low, and the warping may play a

big role in the determination of the best match, as we found in informal tests, which show that

the use of the square-root of the spectrum produces better pitch estimates.

3.5 Number of Harmonics

An important issue is the number of harmonics to be used to analyze the pitch. HPS, SHS,

SHR, and HS use a fixed finite number of harmonics, and CEP and AC use all the available

harmonics (i.e., as many as the sampling frequency allows). In informal tests the best results

were obtained when using as many harmonics as available, although it was found that going

beyond 3.3 k
significantly. Thus, to reduce computational cost it is reasonable to set these limits.

3.6 Warping of the Frequency Scale

As mentioned in Section 3.4, if the input matches perfectly any of the templates, their NIP

will be equal to 1, regardless of the type of warping used on the spectrum. The same applies to

the frequency scale. However, since a perfect match will rarely occur, a warping of the frequency

scale may play a role in determining the best match.

For the purposes of computing the integral of a function, we can think of a warping of the

scale as the process of sampling the function more finely in some regions than others, effectively
giving more emphasis to the more finely sampled regions. In our case, since we are computing
an inner product to estimate pitch, it makes sense to sample the spectrum more finely in the

region that contributes the most to the determination of pitch. It seems reasonable to assume that









this region is the one with the most harmonic energy. In the case of speech, and assuming that

the amplitude of the harmonics decays inversely proportional to frequency, it seems reasonable

to sample the spectrum more finely in the neighborhood of the fundamental and decrease the

granularity as we move up in frequency, following the expected 1/f pattern for the amplitude of
the harmonics. A decrease in granularity should also be performed below the fundamental
because no harmonic energy is expected below it. However, the determination of the frequency
at which this decrease should begin is non-trivial, since we do not know a priori the fundamental

frequency of the incoming sound (that is precisely what we wish to determine).

As we did for the selection of the warping of the amplitude of the spectrum, we appeal to

the auditory system and borrow the frequency scale it seems to use: the ERB scale (see

Chapter 1). Therefore, to compute the similarity between the input spectrum and the template,

we sample both of them uniformly in the ERB scale, whose formula is given in Equation 1-8.

This scale has several of the characteristics we desire (see Figure 1-9): it has a logarithmic
behavior as f increases, tends toward a constant as f decreases, and the frequency at which the
transition occurs (229 Hz) is close to the mean fundamental frequency of speech, at least for
females (Bagshaw, 1994; Wang and Lin, 2004; Schwartz and Purves, 2004). It does not produce
a decrease of granularity as f approaches zero, but at least it does not increase without bound
either, as a pure logarithmic scale does.
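Uniform sampling on the ERB scale can be sketched as follows. Equation 1-8 is not reproduced in this chapter, so the conversion below is an assumption: it uses the common form 21.4 log10(1 + f/229), which agrees with the 229 Hz transition mentioned above and maps 190 Hz to about 5.6 ERBs. The grid step and function names are hypothetical.

```python
import math


def hz_to_erbs(hz):
    """Hertz to ERBs, assuming the form 21.4 * log10(1 + hz / 229)."""
    return 21.4 * math.log10(1.0 + hz / 229.0)


def erbs_to_hz(erbs):
    """Inverse of hz_to_erbs."""
    return 229.0 * (10.0 ** (erbs / 21.4) - 1.0)


def erb_uniform_grid(f_max, step_erbs=0.1):
    """Frequencies in Hz spaced uniformly on the ERB scale from 0 up to f_max."""
    count = int(hz_to_erbs(f_max) / step_erbs) + 1
    return [erbs_to_hz(i * step_erbs) for i in range(count)]


grid = erb_uniform_grid(5000.0)
spacing_low = grid[1] - grid[0]     # spacing in Hz near 0 Hz
spacing_high = grid[-1] - grid[-2]  # spacing in Hz near f_max
```

The spacing in Hertz stays finite near 0 Hz and grows with frequency, so the low-frequency region, where most of the harmonic energy of speech is expected, is sampled more densely.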

The convenience of the use of the ERB scale for pitch estimation over the Hertz and

logarithmic scales was confirmed in informal tests, since better results were obtained when using

the ERB scale. Two other common psychoacoustic scales, the Mel and Bark scales, were also

explored, but they produced worse results than the ERB scale.









3.7 Window Type and Size

Throughout this chapter we have mentioned our wish to obtain a perfect match (i.e., NIP
equal to 1) between the input spectrum and the template corresponding to the pitch of the input.
This section deals with the feasibility of achieving such a goal.

First of all, since the input is non-negative but the template has negative regions, a perfect

match is impossible. One solution would be to set the negative part of the template to zero, but

this would leave us without the useful property that the negative weights have: the production of

low scores for noisy signals (see Section 2.3). Instead, the solution we adopt is to preserve the

negative weights, but ignore them when computing the norm of the template. In other words, we

normalize the kernel using only the norm of its positive part,

\[
K^+(f) = \max(0, K(f)). \tag{3-6}
\]

Hereafter, we will refer to this normalization as $K^+$-normalization.

To obtain a $K^+$-normalized inner product ($K^+$-NIP) close to 1, we must direct our efforts to
make the shape of the spectral peaks match the shape of the positive cosine lobe used as base
element of the template, and also to force the template to have a value of zero in the negative part
of the cosine. Since the shape of the spectral peaks is the same for all peaks, it is enough to
concentrate our efforts on one of them, and for simplicity we will do it for the peak at zero
frequency.

The shape of the spectral peaks is determined by the type of window used to examine the

signal. The most straightforward window is the rectangular window, which literally acts like a

window: it allows seeing the signal inside the window but not outside it. More formally, the

rectangular window multiplies the signal by a rectangular function of the form














\[
\Pi_T(t) = \begin{cases} 1/T, & \text{if } |t| < T/2, \\ 0, & \text{otherwise,} \end{cases} \tag{3-7}
\]

where T is the window size.

Figure 3-7. Fourier transform of rectangular window. [Plot versus frequency (Hz), from -5/T to 5/T.]

If a rectangular window is used to extract a segment of a sinusoid of frequency f Hz to
compute its Fourier transform, the support of this transform will not be concentrated at a single
point but will be smeared in the neighborhood of f. This effect is shown in Figure 3-7 for f = 0; in
other words, the figure shows the Fourier transform of $\Pi_T(t)$. This transform can be written as
sinc(Tf'), where the sinc function is defined as

\[
\operatorname{sinc}(\phi) = \frac{\sin(\pi\phi)}{\pi\phi}. \tag{3-8}
\]

This function consists of a main lobe centered at zero and small side lobes that extend towards
both sides of zero. For any other value of f, its Fourier transform is just a shifted version of this
function, centered at f.
















Figure 3-8. Cosine lobe and square-root of the spectrum of rectangular window. [Plot versus frequency (Hz), from -f/2 to f/2.]



Since the height of the side lobes is small compared to the height of the main lobe, the
most obvious approach to try to maximize the match between the input and the template is to
match the width of the main lobe, 2/T, to the width of the cosine lobe, f/2, and solve for the free
variable T. This produces an "optimal" window size, hereafter denoted T*, equal to 4/f.
Figure 3-8 shows the square-root of the spectrum of a rectangular window of size T = T* = 4/f and
a cosine with period f (i.e., the template used to recognize a pitch of f Hz). The $K^+$-NIP of the
main lobe of the spectrum and the cosine positive lobe (i.e., from -f/4 to f/4) sampled at 128
equidistant points is 0.9925, which seems satisfactorily high. However, the $K^+$-NIP computed
over the whole period of the cosine (i.e., from -f/2 to f/2) sampled at 128 equidistant points is
only 0.5236, which is not very high. This low $K^+$-NIP is caused by the relatively large side lobes,
which reach a height of almost 0.5.
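A rough numerical check of this comparison can be sketched as follows. The exact sampling used for the reported values is not specified beyond the number of points, so the grid below (uniform, including both endpoints) is an assumption and the results may differ slightly from the printed figures. All function names are hypothetical.

```python
import math


def sinc(x):
    """Normalized sinc, sin(pi x) / (pi x), with sinc(0) = 1."""
    return 1.0 if x == 0 else math.sin(math.pi * x) / (math.pi * x)


def k_plus_nip(spectrum, kernel):
    """Inner product normalized by the norms of the spectrum and of max(0, kernel)."""
    dot = sum(s * k for s, k in zip(spectrum, kernel))
    norm_s = math.sqrt(sum(s * s for s in spectrum))
    norm_k = math.sqrt(sum(max(0.0, k) ** 2 for k in kernel))
    return dot / (norm_s * norm_k)


def rectangular_match(lo, hi, samples):
    """K+-NIP for lo <= f'/f <= hi, using a rectangular window of size T = 4/f."""
    ratios = [lo + (hi - lo) * i / (samples - 1) for i in range(samples)]
    # With T = 4/f, the argument of sinc is T f' = 4 f'/f.
    spectrum = [math.sqrt(abs(sinc(4.0 * r))) for r in ratios]
    kernel = [math.cos(2.0 * math.pi * r) for r in ratios]
    return k_plus_nip(spectrum, kernel)


nip_positive_lobe = rectangular_match(-0.25, 0.25, 128)
nip_whole_period = rectangular_match(-0.5, 0.5, 128)
```

The first value comes out close to 1 and the second much lower, in line with the behavior described above: the side lobes of the rectangular window overlap the negative part of the cosine and pull the score down.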






























































3 This time-frequency relation may not be obvious at first sight, but it can be shown using Fourier analysis.


A window with much smaller side lobes is the Hann window. The smaller side lobes are
achieved by attenuating the time-domain window down towards zero at the edges (see footnote 3).
The formula for this window is

\[
h_T(t) = \begin{cases} \dfrac{1}{T}\left[1 + \cos\!\left(\dfrac{2\pi t}{T}\right)\right], & \text{if } |t| < T/2, \\ 0, & \text{otherwise,} \end{cases} \tag{3-9}
\]

where T is the window size (i.e., the size of its support). This window is simply one period of a
raised cosine centered at zero, as illustrated in Figure 3-9.

The Fourier transform of a Hann window of size T is

\[
H_T(f) = \operatorname{sinc}(Tf) + \frac{1}{2}\operatorname{sinc}(Tf - 1) + \frac{1}{2}\operatorname{sinc}(Tf + 1), \tag{3-10}
\]

a sum of three sinc functions, as illustrated in Figure 3-10. The width of the main lobe of this
transform is 4/T, twice as large as the main lobe of the spectrum of the rectangular window.


Figure 3-9. Hann window. [Plot versus time (s).]

Figure 3-10. Fourier transform of the Hann window. The FT of the Hann window consists of a
sum of three sinc functions. [Plot of H(T, f), sinc(Tf), sinc(Tf+1), and sinc(Tf-1) versus frequency (Hz), from -3/T to 3/T.]




Equalizing this width to the width of the cosine lobe, f/2, and solving for T, we obtain an optimal
window size of T* = 8/f.

Figure 3-11. Cosine lobe and square-root of the spectrum of Hann window. [Plot of the square-root spectrum of the Hann window and the cosine versus frequency (Hz).]










Figure 3-11 shows the square-root of the spectrum of a Hann window of size T = T* = 8/f
and a cosine with period f. The similarity between the main lobe and the positive lobe of the
cosine is remarkable. Using Equations 3-8 and 3-10 it can be shown that they match at 5 points:
0, ±f/8, and ±f/4, with values cos(0) = 1, cos(π/4) = 1/√2, and cos(π/2) = 0, respectively. The
$K^+$-NIP of the main lobe of the spectrum and the positive part of the cosine sampled at 128
equidistant points is 0.9996, and the $K^+$-NIP computed over the whole period of the cosine
sampled at 256 equidistant points is 0.8896, much larger than the one obtained with the
rectangular window.
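The five matching points can be verified directly from Equations 3-8 and 3-10. The sketch below evaluates the square root of the Hann spectrum for a window of size T = 8/f and the cosine of period f at the same values of f'/f; the function names are hypothetical.

```python
import math


def sinc(x):
    """Normalized sinc of Equation 3-8."""
    return 1.0 if x == 0 else math.sin(math.pi * x) / (math.pi * x)


def hann_spectrum(x):
    """Equation 3-10 evaluated at x = T f'."""
    return sinc(x) + 0.5 * sinc(x - 1.0) + 0.5 * sinc(x + 1.0)


def sqrt_hann_vs_cosine(ratio):
    """Return (sqrt |H|, cosine) at f'/f = ratio for a Hann window of size T = 8/f."""
    x = 8.0 * ratio  # T f' with T = 8/f
    return math.sqrt(abs(hann_spectrum(x))), math.cos(2.0 * math.pi * ratio)


matches = {r: sqrt_hann_vs_cosine(r) for r in (-0.25, -0.125, 0.0, 0.125, 0.25)}
```

At f'/f = ±1/8 the spectrum equals 1/2, whose square root is 1/√2 = cos(π/4). Between the five points the two curves are close but not identical.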

The same approach can be used to obtain the optimal window size for other window types.

For the most common window types used in signal processing, it can be shown that the width of

the main lobe is 2k/T, where the parameter k depends on the window type (see Oppenheim,

Schafer, and Buck, 1999) and is tabulated in Table 3-1. For these windows, the optimal window



Table 3-1. Common windows used in signal processing*

                              $K^+$-NIP
Window type        k      Positive lobe   Whole period
Bartlett           2      0.9984          0.7959
Bartlett-Hann      2      0.9995          0.8820
Blackman           3      0.9899          0.9570
Blackman-Harris    4      0.9738          0.9689
Bohman             3      0.9926          0.9474
Flat top           5      0.9896          0.9726
Gauss              3.14   0.9633          0.8744
Hamming            2      0.9993          0.9265
Hann               2      0.9996          0.8896
Nuttall            4      0.9718          0.9682
Parzen             4      0.9627          0.9257
Rectangular        1      0.9925          0.5236
Triangular         2      0.9980          0.8820

* The $K^+$-NIP values were computed using 128 equidistant samples for the positive lobe and 256 equidistant
samples for the whole period.









size to analyze a signal with pitch f Hz can be obtained by equalizing 2k/T to the width of the
cosine lobe, f/2, to produce T* = 4k/f.
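The relation T* = 4k/f translates directly into code. The sketch below is illustrative only: it copies the k values of a few windows from Table 3-1, and the function names and the rounding to whole samples are assumptions.

```python
# Main-lobe parameter k for a few windows, copied from Table 3-1.
MAIN_LOBE_K = {
    "Rectangular": 1,
    "Hann": 2,
    "Hamming": 2,
    "Blackman-Harris": 4,
    "Flat top": 5,
}


def optimal_window_size(pitch_hz, window="Hann"):
    """Window size T* = 4k/f in seconds for a candidate pitch of f Hz."""
    return 4.0 * MAIN_LOBE_K[window] / pitch_hz


def optimal_window_samples(pitch_hz, sample_rate_hz, window="Hann"):
    """T* expressed in samples, rounded to the nearest integer."""
    return round(optimal_window_size(pitch_hz, window) * sample_rate_hz)
```

For example, a 190 Hz candidate analyzed with a Hann window needs 8/190 s, about 42 ms, and a 78.125 Hz candidate sampled at 10 kHz needs a Hann window of 1024 samples.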

Table 3-1 also shows the $K^+$-NIPs between the square-root of the spectrum and the cosine
computed over the positive lobe of the cosine (from -f/4 to f/4) and over the whole period of the
cosine (from -f/2 to f/2). The window that produces the largest $K^+$-NIP over the whole period is
the flat-top window. However, its size is so large compared to other windows that the increase in
$K^+$-NIP is probably not worth the increase in computational cost; similar results are obtained
with the Blackman-Harris window, which is 4/5 its size. If computational cost is a serious issue,
a good compromise is offered by the Hamming window, which requires half the size of the
Blackman-Harris window, and produces a $K^+$-NIP of about 0.93. This $K^+$-NIP is larger than the
one produced by the Hann window, with no increased computational cost (k = 2 in both cases).
However, since the difference in performance between them is not large, we prefer the
analytically simpler Hann window.

3.8 SWIPE

Putting all the previous sections together, the SWIPE estimate of the pitch at time t can be

formulated as

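\[
p(t) = \arg\max_f \frac{\displaystyle\int_0^{\mathrm{ERBs}(f_{\max})} \frac{1}{\eta(\varepsilon)^{1/2}}\, K(f, \eta(\varepsilon))\, |X(t, f, \eta(\varepsilon))|^{1/2}\, d\varepsilon}{\left(\displaystyle\int_0^{\mathrm{ERBs}(f_{\max})} \frac{1}{\eta(\varepsilon)}\, \big[K^+(f, \eta(\varepsilon))\big]^2\, d\varepsilon\right)^{1/2} \left(\displaystyle\int_0^{\mathrm{ERBs}(f_{\max})} |X(t, f, \eta(\varepsilon))|\, d\varepsilon\right)^{1/2}}, \tag{3-11}
\]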



where

\[
K(f, f') = \begin{cases} \cos(2\pi f'/f), & \text{if } 3/4 < f'/f < n(f) + 1/4, \\ \frac{1}{2}\cos(2\pi f'/f), & \text{if } 1/4 < f'/f < 3/4 \text{ or } n(f) + 1/4 < f'/f < n(f) + 3/4, \\ 0, & \text{otherwise,} \end{cases} \tag{3-12}
\]











\[
X(t, f, f') = \int_{-\infty}^{\infty} w_{4k/f}(t' - t)\, x(t')\, e^{-j 2\pi f' t'}\, dt', \tag{3-13}
\]

ε is frequency in ERBs, η(·) converts frequency from ERBs into Hertz, ERBs(·) converts
frequency from Hertz into ERBs, $K^+$(·) is the positive part of K(·) {i.e., max[0, K(·)]}, fmax is the

maximum frequency to be used (typically the Nyquist frequency, although 5 k
most applications), n(f) = ⌊fmax/f - 3/4⌋, and $w_{4k/f}(t)$ is one of the window functions in
Table 3-1, with size 4k/f. The kernel corresponding to a candidate with frequency 190 Hz is

shown in Figure 3-12. Panel A shows the kernel in the Hertz scale and Panel B in the ERB scale,

the scale used to compute the integral.
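The piecewise definition of the kernel can be sketched in a few lines. This is a minimal illustration of Equation 3-12 with n(f) = floor(fmax/f - 3/4), not the reference implementation; the function names are hypothetical, and the envelope is written as (f/f')^(1/2), the harmonic decay with p = 1/2 from Section 3.4, which differs from a 1/sqrt(f') envelope only by a constant factor for a given candidate.

```python
import math


def swipe_kernel(f, f_prime, f_max):
    """Kernel K(f, f') of Equation 3-12, before any decaying envelope is applied."""
    n = math.floor(f_max / f - 0.75)  # n(f): number of harmonics used
    ratio = f_prime / f
    cosine = math.cos(2.0 * math.pi * ratio)
    if 0.75 < ratio < n + 0.25:
        return cosine
    if 0.25 < ratio < 0.75 or n + 0.25 < ratio < n + 0.75:
        return 0.5 * cosine  # halved first and last negative lobes
    return 0.0


def weighted_kernel(f, f_prime, f_max):
    """Kernel multiplied by the envelope (f / f')**0.5."""
    if f_prime <= 0.0:
        return 0.0
    return swipe_kernel(f, f_prime, f_max) * math.sqrt(f / f_prime)
```

For a 190 Hz candidate with fmax = 1000 Hz, n(f) = 4: the kernel equals 1 at 190 Hz, -1 at 285 Hz, -1/2 at 95 Hz and at 855 Hz, and 0 at 950 Hz and below 47.5 Hz.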

Although the initial approach of measuring a smooth average peak to valley distance has

been used everywhere in this chapter, we can make a more precise description of the algorithm.




Figure 3-12. SWIPE kernel. A) The SWIPE kernel consists of a cosine that decays as 1/√f, with a
truncated DC lobe and halved first and last negative lobes. B) SWIPE kernel in the
ERB scale. [Two-panel plot of spectrum and kernel: panel A on a linear frequency axis, panel B versus frequency (ERBs).]









It can be described as the computation of the similarity between the square-root of the spectrum
of the signal and the square-root of the spectrum of a sawtooth waveform, using a
pitch-dependent optimal window size. This description gave rise to the name Sawtooth Waveform
Inspired Pitch Estimator (SWIPE).

3.9 SWIPE'

So far in this chapter we have concentrated our efforts on maximizing the similarity

between the input and the desired template, but we have not done anything explicitly to reduce

the similarity between the input and the other templates, which will be the goal of this section.

The first fact we want to mention is that most of the mistakes that pitch estimators make,

including SWIPE, are not random: they consist of estimations of the pitch as multiples or

submultiples of the pitch. Therefore, a good source of error to attack is the score (pitch strength)

of these candidates.

A good feature to reduce supraharmonic errors is to use negative weights between

harmonics. When analyzing a pitch candidate, if there is energy between any pair of consecutive

harmonics of the candidate, this suggests that the pitch, if any, is a lower candidate. This idea is

implemented by the negative weights, which reduce the score of the candidate if there is any

energy between its harmonics. This feature is used by algorithms like SHR, AC, CEP, and

SWIPE.

The effect of negative weights on supraharmonics of the pitch is illustrated in

Figure 3-13A. It shows the spectrum of a signal with fundamental at 100 Hz and all its

harmonics at the same amplitude (vertical lines). (Only harmonics up to 1 k
signal contains harmonics up to 5 k
visualization, but in general they will be wider, with a width that depends on the window size.


















Figure 3-13. Most common pitch estimation errors. A) Harmonic signal with 100 Hz
fundamental frequency and all the harmonics at the same amplitude, and 200 Hz
kernel with positive (continuous lines) and negative (dashed lines) cosine lobes. B)
Same signal and 50 Hz kernel. C) Scores using only positive cosine lobes (exhibits
peaks at sub- and supraharmonics). D) Scores using both positive and negative cosine
lobes (exhibits peaks at subharmonics). E) Scores using both positive and negative
cosine lobes at the first and prime harmonics (exhibits a major peak only at the
fundamental). [Five-panel plot versus frequency (Hz).]



Panel A also shows the positive cosine lobes (continuous curves) used to recognize a pitch of

200 Hz and the negative cosine lobes that reside in between (dashed curves). The positive cosine

lobes at the harmonics of 200 Hz produce a positive contribution towards the score of the 200 Hz

candidate, but the negative cosine lobes at the odd multiples of 100 Hz cancel out this

contribution. Panel C shows the score for each pitch candidate using as kernel only the positive









cosine lobes, whereas Panel D shows the scores using both the positive and the negative cosine

lobes. The effect on the 200 Hz peak is definite: it has disappeared. The same effect is obtained

for higher order multiples of 100 Hz (not shown in the figure).

To reduce subharmonic errors, two techniques were presented in Chapter 2: the use of a

decaying weighting factor for the harmonics, and the use of a bias to penalize the selection of

low frequency candidates. The former is used by SHS and SWIPE, and the latter by AC.

Although these techniques have an effect in reducing the score of subharmonics, significant

peaks are nevertheless present at submultiples of the pitch, as shown in Figure 3-13D.

To further reduce the height of the peaks at subharmonics of the pitch we propose to

remove from the kernel the lobes located at non-prime harmonics, except the lobe at the first

harmonic. Figure 3-13B helps to show the intuition behind this idea. This figure shows the same

spectrum as in Figure 3-13A and the kernel corresponding to the 50 Hz candidate. This kernel

has positive lobes at each multiple of 50 Hz and therefore at each multiple of 100 Hz, producing

a high score for the 50 Hz candidate, as shown in Panel D. Notice that this candidate gets all of

its credit from its 2nd, 4th, 6th, etc., harmonics, i.e., 100 Hz, 200 Hz, 300 Hz, etc., frequencies that

suggest a fundamental frequency (and pitch) of 100 Hz. The same situation occurs with the

candidate at 33 Hz (kernel not shown), but in this case its credit comes from its 3rd, 6th, 9th, etc.,

harmonics.

If we use only the first and prime lobes of the kernel, the candidates at subharmonics of

100 Hz would get credit only from their harmonic at 100 Hz, but not from any other. In general,

it can be shown that with this approach, no candidate below 100 Hz can get credit from more

than one of the harmonics of 100 Hz. In other words, if there is a match between one of the

prime harmonics of this candidate and a harmonic of 100 Hz, no other prime harmonic of the









candidate can match another harmonic of 100 Hz, and therefore the score of all the candidates

below 100 Hz has to be low compared to the score of the 100 Hz candidate. This effect is evident

in Figure 3-13E, which shows the scores of the pitch candidates when using only their first and

prime harmonics. Certainly, there are peaks below 100 Hz, but they are relatively small

compared to the peak at 100 Hz. Contrast this with Panels C and D, where the score of 50 Hz is

relatively high, and therefore the risk of selecting this candidate is high.
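The counting argument behind the use of the first and prime harmonics can be checked with exact arithmetic. The sketch below is an illustration under simple assumptions: it only counts exact coincidences between a candidate's harmonics and the harmonics of a 100 Hz pitch, ignoring lobe widths and weights. The number of harmonics N and all names are hypothetical.

```python
from fractions import Fraction


def is_prime(k):
    """True if k is a prime number (trial division, adequate for small k)."""
    return k >= 2 and all(k % d for d in range(2, int(k ** 0.5) + 1))


def matching_harmonics(candidate, pitch, harmonics):
    """Harmonic numbers of `candidate` that land exactly on a harmonic of `pitch`."""
    return [k for k in harmonics if (k * candidate / pitch).denominator == 1]


N = 30  # hypothetical number of harmonics considered per candidate
all_harmonics = list(range(1, N + 1))
first_and_primes = [k for k in all_harmonics if k == 1 or is_prime(k)]

pitch = Fraction(100)
candidates = [pitch / m for m in range(2, 11)] + [pitch * Fraction(2, 3)]
credit_all = {c: matching_harmonics(c, pitch, all_harmonics) for c in candidates}
credit_prime = {c: matching_harmonics(c, pitch, first_and_primes) for c in candidates}
```

With all harmonics, the 50 Hz candidate coincides with the pitch's harmonics at every even harmonic number. With only the first and prime harmonics, each candidate below 100 Hz in this list coincides at most once (for 50 Hz, only through its second harmonic).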

An extra step needs to be done to avoid bias in the scores. Recall from the beginning
of this chapter that the central idea of SWIPE was to compute the average peak-to-valley distance
at harmonic locations in the spectrum. When computing this average for a single peak, the
weight of the peak was twice as large as the weight of its valleys, as expressed in Equation 3-1.
Since the global average is the average of this equation over all the peaks, and since each valley
is associated with two peaks too, the weight of the valleys, except the first and the last ones, was
the same as the weight of the peaks, as expressed in Equation 3-2. However, if we use only the
first and prime harmonics, the weight of the valleys will not necessarily be -1, but will depend on
whether the valleys are between the first or prime harmonics. The only valleys with a weight
of -1 will be the valley between the first and second harmonics, and the valley between the
second and third harmonics; all the other valleys adjacent to a prime harmonic will have a weight
of -1/2, before applying the decaying weighting factor, of course.

This variation of SWIPE in which only the first and prime harmonics are used to estimate
the pitch will be called SWIPE' (read SWIPE prime). Its kernel is defined as

\[
K(f, f') = \sum_{i \in \{1\} \cup P} K_i(f, f'), \tag{3-14}
\]

where P is the set of prime numbers, and

\[
K_i(f, f') = \begin{cases} \cos(2\pi f'/f), & \text{if } |f'/f - i| < 1/4, \\ \frac{1}{2}\cos(2\pi f'/f), & \text{if } 1/4 < |f'/f - i| < 3/4, \\ 0, & \text{otherwise.} \end{cases} \tag{3-15}
\]

Notice that the SWIPE kernel can also be written as in Equation 3-14, by including all the
harmonics in the sum. The SWIPE' kernel corresponding to a pitch candidate of 190 Hz (5.6
ERBs) is shown in Figure 3-14. The numbers on top of the peaks show the harmonic number
they correspond to.

Figure 3-14. SWIPE' kernel. Similar to the SWIPE kernel but includes only the first and prime
harmonics. [Plot of spectrum and kernel versus frequency (ERBs).]
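The construction of the SWIPE' kernel from individual lobes can be sketched as follows. This is a minimal illustration of Equations 3-14 and 3-15 expressed in terms of the ratio f'/f; the decaying weighting factor is deliberately left out, and the function names and the cap on the number of harmonics are assumptions.

```python
import math


def is_prime(k):
    """True if k is a prime number (trial division, adequate for small k)."""
    return k >= 2 and all(k % d for d in range(2, int(k ** 0.5) + 1))


def lobe(i, ratio):
    """K_i of Equation 3-15 as a function of ratio = f'/f."""
    distance = abs(ratio - i)
    cosine = math.cos(2.0 * math.pi * ratio)
    if distance < 0.25:
        return cosine        # positive lobe at harmonic i
    if 0.25 < distance < 0.75:
        return 0.5 * cosine  # half of each neighboring valley
    return 0.0


def swipe_prime_kernel(ratio, n_harmonics):
    """Equation 3-14: sum of K_i over i = 1 and the primes up to n_harmonics."""
    used = [i for i in range(1, n_harmonics + 1) if i == 1 or is_prime(i)]
    return sum(lobe(i, ratio) for i in used)
```

Evaluating the kernel at half-integer ratios shows the valley weights before any decay is applied: -1 at 1.5 and 2.5, where both neighboring harmonics are in the sum, and -1/2 at 3.5, where only the third harmonic contributes.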

3.9.1 Pitch Strength of a Sawtooth Waveform

Since the template used by SWIPE' has peaks only at the first and prime harmonics, a
perfect match between the template and the spectrum of a sawtooth waveform is impossible
(unless fmax is so small relative to the pitch that the template contains no more than three
harmonics). Therefore, it would be interesting to analyze the $K^+$-NIP between the spectrum and
the template as a function of the number of harmonics. Figure 3-15 shows the pitch strength
($K^+$-NIP) obtained using SWIPE and SWIPE' for different pitches and different numbers of
harmonics. The pitches shown are 625, 312, 156, and 78.1 Hz. They were chosen because their
optimal window sizes are powers of two for the sampling rates used: 2.5, 5, 10, 20, and 40 kHz.
In each case, fmax was set to the Nyquist frequency.

Figure 3-15. Pitch strength of sawtooth waveform. A) 625 Hz. B) 312 Hz. C) 156 Hz.
D) 78.1 Hz. [Four-panel plot of pitch strength for SWIPE and SWIPE' versus number of harmonics (1 to 255).]

The pitch strength estimates produced by SWIPE are larger than the ones produced by

SWIPE', except when the number of harmonics is less than four, in which case both algorithms

use all the harmonics. The pitch strength estimates produced by SWIPE in Figure 3-15 have a










mean of 0.93 and a variance of 5.1x10^[...]. This mean is significantly larger than the K+-NIP

reported in Table 3-1 for the Hann window. The reason for the mismatch is that the granularity

used to produce the data in Table 3-1 and the data in Figure 3-15 is different. The K+-NIP values

in Table 3-1 are based on a sampling of 128 points per spectral lobe, while the data in

Figure 3-15 is based on a sampling of 10 points per ERB, which depending on the pitch and the harmonic

being sampled, may correspond to a range of about 0 to 40 points per spectral lobe.

On the other hand, the mean of the pitch strength estimates produced by SWIPE' is 0.87

and the variance is 1.0x10-3. The smaller mean is expected since the template of SWIPE'

includes only the first and prime harmonics, while a sawtooth waveform has energy at each of its

harmonics. The larger variance is also expected since the prime numbers become sparser as they

become larger, causing a reduction in the similarity of the template and the spectrum of the

sawtooth waveform as the number of harmonics increases.

It would be useful to have a lower bound for the pitch strength estimates produced by

SWIPE', but an analytical formulation for it is intractable. However, the data in Figure 3-15,

which is representative of a wide range of pitches and number of harmonics, suggests that the

pitch strength produced by SWIPE' for a sawtooth waveform does not go below 0.8.

3.10 Reducing Computational Cost

3.10.1 Reducing the Number of Fourier Transforms

The computation of Fourier transforms is one of the most computationally expensive

operations of SWIPE and SWIPE'. Therefore, to reduce computational cost it is important to

reduce the number of Fourier transforms. There are two strategies to achieve this: to reduce the

window overlap and to share Fourier transforms among several candidates.










3.10.1.1 Reducing window overlap

The most common windows used in signal processing are the ones that are attenuated

towards zero at their edges (e.g., Hann and Hamming windows). A disadvantage of this

attenuation is that it is possible to overlook short events if these events are located at the edges of

the windows. To avoid this situation, it is common to use overlapping windows, which increases

the coverage of the signal, at the cost of an increase in computation. However, after a certain

point, overlapping windows start to produce redundancy in the analysis, without adding any

significant benefit. The goal of this section is to propose a scheme to obtain a good balance

between signal coverage and computational cost.

As mentioned in Section 1.1.4, depending on frequency, a minimum of two to four cycles

are necessary to perceive the pitch of a pure tone. Based on the similarity of the data used to

arrive at this conclusion and data obtained using musical instruments, it is reasonable to assume

that these results are applicable to more general waveforms, in particular, to sawtooth

waveforms. To avoid the interaction between the number of cycles and pitch, for purposes of the

algorithm, we set the minimum number of cycles necessary to determine pitch to four, the

maximum among the minimum number of cycles required over all frequencies.

Since SWIPE and SWIPE' are designed to produce maximum pitch strength for a sawtooth

waveform [4] and zero pitch strength for a flat spectrum [5], a natural choice to decide whether a

sound has pitch is to use as threshold half the pitch strength of a sawtooth waveform. (In Section

3.9.1 it was found that the pitch strength of a sawtooth waveform is about 0.93 for SWIPE and

between 0.83 and 0.93 for SWIPE'.) To make these algorithms produce maximum pitch strength,


4 In fact, SWIPE' produces maximum pitch strength for sawtooth waveforms with the non-prime harmonics
removed (except the first one), but we believe this type of signal is unlikely to occur in nature.

5 The pitch strength of a flat spectrum is in fact negative because of the decaying kernel envelope.










a perfect match between the kernel and the spectrum of the signal is necessary, which requires

that the window contains eight cycles of the sawtooth waveform, when using a Hann window. If

the signal contains exactly eight cycles (i.e., if it is zero outside the window) and is shifted

slightly with respect to the window, the pitch strength decreases, and it reaches a limit of zero

when the signal gets completely out of the window. Although hard to show analytically, it is easy

to show numerically that the relation between the shift and pitch strength is linear.

Therefore, if the window contains four or more cycles of the sawtooth waveform, the pitch

strength is at least half the maximum attainable pitch strength (i.e., the one achieved when the

window is full of the sawtooth waveform), and if the window contains fewer than four cycles of

the sawtooth waveform, the pitch strength is less than half the maximum attainable pitch

strength.






















Figure 3-16. Windows overlapping.

Object 3-2. Four cycles of a 100 Hz sawtooth waveform (WAV file, 2 KB)









Therefore, if we determine the existence of pitch based on a pitch strength threshold equal

to half the maximum attainable pitch strength, to determine as pitched a signal consisting of four

cycles of a sawtooth waveform, we need to ensure that there exists at least one window whose

coverage includes the whole signal. It is straightforward to show that to achieve this goal, we

need to distribute the windows such that their separation is no larger than four cycles of the pitch

period of the signal. In other words, the windows must overlap by at least 50%.

This situation is illustrated in Figure 3-16, which shows a signal consisting of four cycles

of a sawtooth waveform (listen to Object 3-2) and two Hann windows centered at the beginning

and the end of the signal. The windows are separated at a distance of four cycles, and the support

of each of them overlaps with the whole signal, making it possible for each window to reach the

pitch strength threshold. If the signal is slightly shifted in any direction, one of the windows will

cover less than four periods, but the other will cover the four periods.

This would not be true if the separation of the windows is larger than four cycles. If the

support of one of the windows overlaps completely with the signal but the separation of the

windows is larger than four cycles, the other window will not cover the signal completely, and

therefore a small shift of the signal towards the latter window would not necessarily put the

whole signal inside the window, making it impossible for any of the windows to produce a pitch

strength larger than the threshold.
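The overlap rule can be turned into a window length and a hop size for a given pitch candidate. The following is a minimal sketch under the assumptions stated in this section (a Hann window of eight cycles and a window separation of at most four cycles); the function name and its parameters are hypothetical.

```python
def hann_window_and_hop(f0, fs, cycles_per_window=8, max_hop_cycles=4):
    """Return (window length, hop size) in samples for a pitch candidate of f0 Hz.

    The hop is floored so that consecutive windows are never separated by
    more than max_hop_cycles periods, i.e. they overlap by at least 50%.
    """
    period = fs / f0                          # samples per pitch period
    window = round(cycles_per_window * period)
    hop = int(max_hop_cycles * period)
    return window, hop

window, hop = hann_window_and_hop(100.0, 8000.0)
overlap = 1.0 - hop / window
print(window, hop, overlap)
```

For a 100 Hz candidate sampled at 8 kHz, the window holds 640 samples and consecutive windows start 320 samples apart.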

3.10.1.2 Using only power-of-two window sizes

There is a problem with the optimal window size (O-WS) proposed in Section 3.7: each

pitch candidate has its own, which means that a different STFT must be computed for each

candidate. If we separate the candidates at a distance of 1/8 semitone over a range of 5 octaves

(appropriate for music, for example), we will need to compute 8*12*5 = 480 STFTs for each










pitch estimate. Not only that, for some WSs it may be inefficient to use an FFT (recall that the

FFT is more efficient for window sizes that are powers of two).

To alleviate this problem, we propose to substitute the O-WS with the power-of-two (P2)

WS that produces the maximum K+-NIP between the square-root of the main lobe of the

spectrum and the cosine kernel. To find such a WS, it is convenient to have a closed-form

formula for the K+-NIP of these functions, but this involves integrating the product of a cosine

and the square-root of the sum of three sinc functions, which is analytically intractable.

As an alternative, we approximate the square-root of the spectral lobe with an idealized

spectral lobe (ISL) consisting of the function it approximates: a positive cosine lobe. Figure 3-17

shows a K+-normalized cosine whose positive part has a width of f/2 (i.e., the cosine template

used by an f Hz pitch candidate), and two normalized ISLs whose widths are half and twice the

width of the positive part of the cosine. Since the cosine and the ISLs are symmetric around zero,

the K+-NIP can be computed using only the positive frequencies. Hence, the K+-NIP





[Figure 3-17 plot. Curves: K+-normalized cos(2πf'/f) (template), K+-normalized cos(4πf'/f) (|f'| < f/8), and K+-normalized cos(πf'/f) (|f'| < f/2). Horizontal axis: Frequency (Hz), from -f/2 to f/2.]

Figure 3-17. Idealized spectral lobes.









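of the central positive lobe of a cosine with period f/r (the ISL) and a cosine with period f (the

template) can be computed as

\Pi(r) = \frac{\int_0^{f/4r} \cos(2\pi r f'/f) \cos(2\pi f'/f) \, df'}
              {\left[ \int_0^{f/4r} \cos^2(2\pi r f'/f) \, df' \right]^{1/2} \left[ \int_0^{f/4} \cos^2(2\pi f'/f) \, df' \right]^{1/2}}

       = \frac{\frac{1}{2} \int_0^{f/4r} \{ \cos[2\pi(1+r) f'/f] + \cos[2\pi(1-r) f'/f] \} \, df'}
              {[f/8r]^{1/2} \, [f/8]^{1/2}}

       = \frac{2\sqrt{r}}{\pi} \left[ \frac{\sin[2\pi(1+r) f'/f]}{1+r} + \frac{\sin[2\pi(1-r) f'/f]}{1-r} \right]_{f'=0}^{f/4r}

       = \frac{2\sqrt{r}}{\pi} \left\{ \frac{\sin[\pi(1+r)/2r]}{1+r} + \frac{\sin[\pi(1-r)/2r]}{1-r} \right\}.    (3-16)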

It is convenient to transform the input of this function to a base-2 logarithmic scale,

λ = log2(r), and then redefine the function as

\Pi(\lambda) = \frac{2^{1+\lambda/2}}{\pi} \left\{ \frac{\sin[(2^\lambda + 1)\pi / 2^{\lambda+1}]}{1 + 2^\lambda} + \frac{\sin[(1 - 2^\lambda)\pi / 2^{\lambda+1}]}{1 - 2^\lambda} \right\}.    (3-17)

Figure 3-18A shows Π(λ) for λ between -1 and 1 (i.e., r = 2^λ between 1/2 and 2). As λ departs

from zero, Π(λ) departs from 1, as expected. However, the distribution is not symmetric: a

decrease in λ has a larger effect on Π(λ) than an increase in λ. This makes sense since a decrease

in λ corresponds to a widening of the ISL, which puts part of it in the region where the cosine

template is negative (see the wider ISL in Figure 3-17), producing a large decrease in Π(λ). On the

other hand, narrowing the ISL keeps it in the positive region of the cosine template, producing a

smaller decrease in Π(λ).
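The asymmetry described above can be verified numerically. The sketch below assumes that Equation 3-16 reduces to the closed form written in the docstring; the function names are hypothetical, and the second term is rewritten with np.sinc only to avoid the 0/0 form at r = 1.

```python
import numpy as np

def nip_isl(r):
    """Assumed closed form of Eq. 3-16:
    (2*sqrt(r)/pi) * ( sin[pi(1+r)/(2r)]/(1+r) + sin[pi(1-r)/(2r)]/(1-r) )."""
    r = np.asarray(r, dtype=float)
    first = np.sin(np.pi * (1 + r) / (2 * r)) / (1 + r)
    # sin[pi(1-r)/(2r)]/(1-r) == (pi/(2r)) * sinc((1-r)/(2r)), with np.sinc(x) = sin(pi x)/(pi x)
    second = (np.pi / (2 * r)) * np.sinc((1 - r) / (2 * r))
    return (2 * np.sqrt(r) / np.pi) * (first + second)

def nip_isl_log2(lam):
    """Same function on the base-2 logarithmic scale, r = 2**lam."""
    return nip_isl(2.0 ** np.asarray(lam, dtype=float))

for lam in (-0.5, 0.0, 0.5):
    print(lam, float(nip_isl_log2(lam)))
```

With this form the value is 1 at λ = 0, and widening the ISL by half an octave (λ = -0.5) costs more than narrowing it by the same amount (λ = 0.5), in agreement with the discussion of Figure 3-18A.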

















Figure 3-18. K+-normalized inner product between template and idealized spectral lobes.

Figure 3-18A can be helpful in finding the P2-WS that produces the largest K+-NIP

between the ISL and the template. If the O-WS for the template is T seconds and the sampling [...]

[...] shows the difference between Π(λ) and Π(λ-1) as a function of λ between 0 and 1. From

the figure we can infer that for λ between 0 and 0.56 we should use the larger P2-WS, and for

λ between 0.56 and 1, we should use the smaller P2-WS. However, Figure 3-18 shows also that

there is not much loss in the K+-NIP by choosing 0.5 as threshold rather than 0.56. Therefore, to

simplify the algorithm we decided to set the threshold at 0.5. In other words, to determine the











[Figure 3-19 plot. Panel A: pitch strength obtained with the larger WS (+) and with the smaller WS (o). Panel B: the same curves and their combination. Horizontal axis: Frequency (Hz), from 160 to 280.]

Figure 3-19. Individual and combined pitch strength curves.



P2-WS to use for a pitch candidate, we transform the O-WS and the P2-WSs to a logarithmic

scale, and choose the P2-WS closest to the optimal.

Unfortunately, this approach produces discontinuities in the pitch strength (PS) curves, as

illustrated in Figure 3-19A. The PS values marked with a plus sign were produced using a WS

larger than the WS used for the ones marked with a circle. To emphasize the effect, the pitch of the

signal (220 Hz) was chosen to match the point at which the change of WS occurs. Since the PS

values produced by the larger window in the neighborhood of the pitch are larger than the ones

produced by the smaller window, the pitch could be biased toward a lower value.

Although an effort was made to find an appropriate value for the threshold, it was based on

an idealized spectrum, which does not have the side lobes found in real spectra. This problem

can be alleviated by using a threshold larger than 0.56, determined through trial and error, but we










found a better solution: to compute the PS as a linear combination of the PS values produced by

the two closest P2-WSs, where the coefficient of each P2-WS decreases linearly with the

log-distance between that P2-WS and the O-WS.

Concretely, to determine the P2-WSs used to compute the PS of a candidate with

frequency f Hz, the O-WS is written as a power of two, N* = 2^(L+λ), where L is an integer and

0 ≤ λ < 1. Then, the PS values S_0(f) and S_1(f) are computed using windows of size 2^L and 2^(L+1),

respectively. Finally, these PSs are combined into a single one to produce the final PS

S(f) = (1 - \lambda) S_0(f) + \lambda S_1(f).    (3-18)

Figure 3-19B shows how this combination of PS curves smooths the discontinuity found in

Figure 3-19A.
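The combination rule can be sketched as follows. This is an illustration only: the function names are hypothetical, the optimal window size is taken as an input in samples, and the two pitch strength values passed to the second function are placeholders for the values obtained with the windows of size 2^L and 2^(L+1).

```python
import math

def split_window_size(n_opt):
    """Write n_opt = 2**(L + lam) with integer L and 0 <= lam < 1."""
    log_n = math.log2(n_opt)
    L = math.floor(log_n)
    return L, log_n - L

def combined_pitch_strength(s0, s1, lam):
    """Linear combination of the strengths from the 2**L and 2**(L+1) windows."""
    return (1.0 - lam) * s0 + lam * s1

L, lam = split_window_size(640)          # e.g. an optimal window of 640 samples
print(L, lam)                            # windows of 2**9 and 2**10 samples are used
print(combined_pitch_strength(0.80, 0.90, lam))
```

When the optimal window size is itself a power of two, lam is zero and only the strength from the window of size 2^L contributes.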

It would be interesting to know how much is lost in PS by using the formula proposed in

Equation 3-18, when the O-WS is not a power of two. This loss can be approximated by finding














Figure 3-20. Pitch strength loss when using suboptimal window sizes.









the minimum of the linear combination (1-λ) Π(λ) + λ Π(λ-1) for 0 < λ < 1, which is plotted in

Figure 3-20. It can be seen that it has a minimum of 0.93 at around λ = 0.4. Therefore, the

maximum loss when computing PS using the two closest P2-WSs is 7%. Since the minimum PS

of a sawtooth waveform when using an O-WS is about 0.92 for SWIPE and 0.83 for SWIPE'

(see Figure 3-15), the minimum pitch strength of a sawtooth waveform when using the two

closest P2-WSs is about 0.86 for SWIPE and 0.77 for SWIPE'.
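The size of this loss can be reproduced numerically. The sketch below assumes the closed form of Π given in the docstring and evaluates the combination (1-λ) Π(λ) + λ Π(λ-1) on a grid; the names are hypothetical and the grid resolution is arbitrary.

```python
import numpy as np

def nip_isl_log2(lam):
    """Assumed closed form of Pi on the log2 scale, with r = 2**lam:
    (2*sqrt(r)/pi) * ( sin[pi(1+r)/(2r)]/(1+r) + sin[pi(1-r)/(2r)]/(1-r) )."""
    r = 2.0 ** np.asarray(lam, dtype=float)
    first = np.sin(np.pi * (1 + r) / (2 * r)) / (1 + r)
    second = (np.pi / (2 * r)) * np.sinc((1 - r) / (2 * r))   # avoids 0/0 at r = 1
    return (2 * np.sqrt(r) / np.pi) * (first + second)

lam = np.linspace(0.0, 1.0, 1001)
combined = (1 - lam) * nip_isl_log2(lam) + lam * nip_isl_log2(lam - 1)

worst = float(combined.min())
lam_worst = float(lam[combined.argmin()])
print(f"minimum {worst:.3f} at lambda = {lam_worst:.2f}")
print(f"lower bounds: SWIPE {0.92 * worst:.2f}, SWIPE' {0.83 * worst:.2f}")
```

Under these assumptions the minimum is close to 0.93 near λ = 0.4, and scaling the minimum pitch strengths of 0.92 and 0.83 by it gives roughly 0.86 and 0.77.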

Besides using an optimal window size for the FFT computation, the approximation of

O-WSs using P2-WSs has another advantage that is probably more important: the same FFT can

be shared by several pitch candidates, more precisely, by all the candidates within an octave of

the optimal pitch for that FFT. Going back to the example that started this section, the

replacement of the O-WS with the closest P2-WSs reduces the number of FFTs required to

estimate the pitch from 480 to just 5: a huge saving in computation.

Using this approach, and translating the algorithm to a discrete-time domain (necessary to

compute an FFT), we can write the SWIPE' estimate of the pitch at the discrete-time index τ as

p[\tau] = \arg\max_f \; (1 - \lambda(f)) \, S_{L(f)}(\tau, f) + \lambda(f) \, S_{L(f)+1}(\tau, f),    (3-19)


where

\lambda(f) = L^*(f) - L(f),    (3-20)

L(f) = \lfloor L^*(f) \rfloor,    (3-21)

L^*(f) = \log_2(4 k f_s / f),    (3-22)












SL 1z f 1 ,(3-23)

r ERBs( fmax)1 K+f ~a> ERBs( fmx)1



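\hat{X}_N[\tau, f'] = I\big( (0, \ldots, N-1), \; X_N[\tau, (0, \ldots, N-1)], \; f' N / f_s \big),    (3-24)


X_N[\tau, \varphi] = \sum_{\tau'} w_N[\tau' - \tau] \, x[\tau'] \, e^{-j 2\pi \varphi \tau' / N},    (3-25)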


Δε is the ERB scale step size (0.1 gives good enough resolution), I(Φ, Ξ, φ) is an interpolating

function that uses the functional relations Ξ_k = F(Φ_k) to predict the value of F(φ), and X_N[τ, φ]

(φ = 0, 1, ..., N-1) is the discrete Fourier transform (computed via FFT) of the discrete signal

x[τ'], multiplied by the size-N windowing function w_N[τ'], centered at τ. The other variables,

constants, and functions are defined as before (see Section 3.8). A Matlab implementation of this

algorithm is given in Appendix A.
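The role of the interpolating function can be illustrated with a short sketch: the DFT gives the spectrum only at integer bin indices, and the value at an arbitrary frequency f' is read off at the fractional index f'N/fs. The code below is not the Appendix A implementation. The function names are hypothetical, linear interpolation of the magnitude is an assumption (the text does not fix the interpolation method), and the frame is zero-padded where it extends past the signal.

```python
import numpy as np

def windowed_dft(x, center, n):
    """Magnitude DFT of n samples of x centred at index `center`, Hann-weighted."""
    start = center - n // 2
    idx = np.arange(start, start + n)
    valid = (idx >= 0) & (idx < len(x))
    frame = np.zeros(n)
    frame[valid] = x[idx[valid]]              # zero-pad outside the signal
    return np.abs(np.fft.fft(frame * np.hanning(n)))

def spectrum_at(mag, fs, freqs_hz):
    """Spectrum magnitude at arbitrary frequencies, read at fractional bin f' * N / fs.

    Linear interpolation stands in for the interpolating function I (an assumption).
    """
    n = len(mag)
    return np.interp(np.asarray(freqs_hz, dtype=float) * n / fs, np.arange(n), mag)

fs = 8000
x = np.sin(2 * np.pi * 1000 * np.arange(2048) / fs)   # 1 kHz test tone
mag = windowed_dft(x, center=1024, n=256)
print(spectrum_at(mag, fs, [1000.0, 1015.625, 1500.0]))
```

Because the same magnitude array serves any query frequency, all candidates that share a window size can reuse one FFT per frame.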

3.10.2 Reducing the Number of Spectral Integral Transforms

The pitch resolution of SWIPE and SWIPE' depends on the granularity of the pitch

candidates. Therefore, to achieve high pitch resolution, a large number of pitch candidates must

be used, and since the pitch strength of each candidate is determined by computing a K+-NIP

between its kernel and the spectrum, the computational cost of the algorithm would increase

enormously. To avoid this situation, we propose to compute K+-NIPs only for certain candidates,

and then use interpolation to estimate the pitch strength of the other candidates.

As noted by de Cheveigne (2002), the AC of a signal is the Fourier transform of its power

spectrum, and therefore the AC is a sum of cosines that can be approximated around zero by

using a Taylor series expansion with even powers. If the signal is periodic, its AC is also










periodic, and therefore the shape of the AC around the pitch period is the same as the shape

around zero, and therefore it can also be approximated by the same Taylor series, centered at the

pitch period. If the width of the spectral lobes is narrow and the energy of the high frequency

components is small, the terms of order 4 in the series vanish as the independent variable

approaches the pitch period, and therefore the series can be approximated using a parabola.

Since SWIPE performs an inner product between the spectrum and a kernel consisting of

cosine lobes, a similar argument can be applied to the pitch strength curves produced by SWIPE.

However, the quality of the fit of a parabola is not guaranteed for two reasons: first, the

spectral lobes produced by SWIPE are not narrow, since they are as wide as the positive

lobes of the cosine; and second, the use of the square-root of the spectrum rather than its energy

makes the contribution of the high frequency components large, violating the requirement of low

contribution of high frequency components. Nevertheless, parabolic interpolation produces a

good fit to the pitch strength curve in the neighborhood of the SWIPE peaks, as we will proceed

to show.

Let's derive an approximation to the pitch strength curve σ(t) produced by SWIPE for a

sawtooth waveform with fundamental frequency f0 = 1/T0 Hz in the neighborhood of the pitch

period T0. To simplify the equations, let's define the scaling transformations ω = f'/f0 and

τ = 2πt/T0. To make the calculations tractable, let's use idealized spectral lobes (i.e., cosine lobes)

and let's ignore the normalization factors and the change of width of the spectral lobe with

change of window size caused by a change of pitch candidate. Let's also replace the continuous

decaying envelope of the kernel with a decaying step function that gives a weight of 1/√k to the

k-th harmonic. With all these simplifications, the pitch strength of a candidate with scaled pitch









period τ in the neighborhood of 2π (i.e., when the non-scaled pitch period t is in the

neighborhood of T0) can be approximated as


\sigma(\tau) = \sum_{k=1}^{n} \sigma_k(\tau),    (3-26)

where

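\sigma_k(\tau) = \frac{1}{\sqrt{k}} \int_{k-1/4}^{k+1/4} \cos(2\pi\omega) \cos(\tau\omega) \, d\omega

    = \frac{1}{2\sqrt{k}} \int_{k-1/4}^{k+1/4} \{ \cos[(\tau - 2\pi)\omega] + \cos[(\tau + 2\pi)\omega] \} \, d\omega

    = \frac{1}{2\sqrt{k}} \left[ \frac{\sin[(\tau - 2\pi)\omega]}{\tau - 2\pi} + \frac{\sin[(\tau + 2\pi)\omega]}{\tau + 2\pi} \right]_{k-1/4}^{k+1/4}

    = \frac{1}{2\sqrt{k}} \left\{ \frac{\sin[(k+1/4)(\tau - 2\pi)] - \sin[(k-1/4)(\tau - 2\pi)]}{\tau - 2\pi}

      + \frac{\sin[(k+1/4)(\tau + 2\pi)] - \sin[(k-1/4)(\tau + 2\pi)]}{\tau + 2\pi} \right\}.    (3-27)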


Since we are interested in approximating this function in the neighborhood of 2π, we can

equivalently shift the function 2π units to the left by defining σ̃_k(τ) = σ_k(τ + 2π), and then

approximate σ̃_k(τ) in the neighborhood of zero. Since sin(x)/x = 1 - x^2/3! + x^4/5! + O(x^6) in the

neighborhood of zero, it is useful to express σ̃_k(τ) as

\tilde{\sigma}_k(\tau) = \frac{k+1/4}{2\sqrt{k}} \frac{\sin[(k+1/4)\tau]}{(k+1/4)\tau} - \frac{k-1/4}{2\sqrt{k}} \frac{\sin[(k-1/4)\tau]}{(k-1/4)\tau}

    + \frac{1}{2\sqrt{k}} \frac{\sin[(k-1/4)\tau] - \sin[(k+1/4)\tau]}{\tau + 4\pi},    (3-28)

which has the Taylor series expansion











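\tilde{\sigma}_k(\tau) = \frac{k+1/4}{2\sqrt{k}} \left[ 1 - \frac{[(k+1/4)\tau]^2}{3!} + \frac{[(k+1/4)\tau]^4}{5!} + O(\tau^6) \right]

    - \frac{k-1/4}{2\sqrt{k}} \left[ 1 - \frac{[(k-1/4)\tau]^2}{3!} + \frac{[(k-1/4)\tau]^4}{5!} + O(\tau^6) \right]

    + \frac{1}{2\sqrt{k}(\tau + 4\pi)} \left[ (k-1/4)\tau - \frac{[(k-1/4)\tau]^3}{3!} - (k+1/4)\tau + \frac{[(k+1/4)\tau]^3}{3!} + O(\tau^5) \right]    (3-29)

in the neighborhood of zero. Finally, the approximation of the pitch strength curve in the

shifted-time domain is

\tilde{\sigma}(\tau) = \sum_{k=1}^{n} \tilde{\sigma}_k(\tau).    (3-30)

[Figure 3-21 plot. Horizontal axis: Number of harmonics.]

Figure 3-21. Coefficients of the pitch strength interpolation polynomial.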










Figure 3-21 shows the relative value of the coefficients of the expansion as a function of the

number of harmonics in the signal. As the number of harmonics increases, the relative weight of

the order-4 coefficient increases. However, as τ approaches zero, its fourth power becomes so

small that its overall contribution to the sum is small compared to the contribution of the order-2

term.

This effect is clear in Figure 3-22, which shows σ̃(τ) for a sawtooth waveform with 15

harmonics using polynomials of order 2 and order 4 in the range +/- 0.045, which corresponds to

+/- 1/8 semitones. The curve has been scaled to have a maximum of 1. The large circles

correspond to candidates separated by 1/8 semitones, which is the interval used in our

implementation of SWIPE and SWIPE' for the distance between pitch candidates for which the

pitch strength is computed directly. The other markers correspond to candidates separated by

1/64 semitones, which is the resolution used to fine tune the pitch strength curve based on the

pitch strength of the candidates for which the pitch strength is computed directly. As observed in












[Figure 3-22 plot. Curves: interpolated (order 2) and interpolated (order 4). Horizontal axis: Normalized time τ (2πt/T0 - 2π), from -0.04 to 0.04.]

Figure 3-22. Interpolated pitch strength.










the figure, for such small values of τ, the pitch strength values obtained with an order 2

polynomial (squares) are indistinguishable from the ones obtained with an order 4 polynomial

(diamonds). Hence, a parabola is good enough to estimate the pitch strength between candidates

separated at distances as small as 1/8 semitones.
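The refinement step can be sketched as a three-point parabolic fit around the strongest candidate. The code below is an illustration under stated assumptions: the function name is hypothetical, the abscissa is taken to be the candidate position on a locally uniform axis (here octaves, with candidates 1/96 octave apart, i.e. 1/8 semitone), and the pitch strengths are synthetic.

```python
import numpy as np

def refine_peak(x, s):
    """Fit a parabola through three (x, s) samples and return its vertex.

    x: positions of the best directly computed candidate and its two neighbors.
    s: pitch strengths at those positions (the middle one should be the largest).
    """
    a, b, c = np.polyfit(x, s, 2)             # exact fit: three points, degree two
    x_peak = -b / (2 * a)
    return x_peak, np.polyval([a, b, c], x_peak)

step = 1.0 / 96.0                              # 1/8 semitone, in octaves
x = np.array([-step, 0.0, step])
s = 1.0 - 3.0 * (x - 0.002) ** 2               # synthetic strengths peaking at x = 0.002
x_peak, s_peak = refine_peak(x, s)
print(x_peak, s_peak)
```

The vertex recovers a peak that lies between the directly computed candidates, which is how a resolution finer than the candidate spacing can be obtained without extra inner products.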

3.11 Summary

This chapter described the SWIPE algorithm and its variation SWIPE'. The initial

approach of the algorithm was the search for the frequency that maximizes the average peak-to-

valley distance at harmonic locations. Several modifications to this idea were applied to improve

its performance: the locations of the harmonics were blurred, the spectral amplitude and the

frequency scale were warped, an appropriate window type and size were chosen, and

simplifications to reduce computational cost were introduced. After these modifications, SWIPE

estimates the pitch as the fundamental frequency of the sawtooth waveform whose spectrum best

matches the spectrum of the input signal. Its variation, SWIPE', uses only the first and prime

harmonics of the signal.









CHAPTER 4
EVALUATION

To assess the relevance of SWIPE and SWIPE', they were compared against other

algorithms using two speech databases and a musical instruments database. This chapter presents

a brief description of these algorithms, databases, and the evaluation process. A more detailed

description is given in Appendix B.

4.1 Algorithms

The algorithms with which SWIPE and SWIPE' were compared were the following:

* AC-P: This algorithm (Boersma, 1993) computes the autocorrelation of the signal and
divides it by the autocorrelation of the window used to analyze the signal. It uses post-
processing to reduce discontinuities in the pitch trace. It is available with the Praat System
at (http://www.fon.hum.uva.nl/praat). The name of the function is ac.

* AC-S: This algorithm uses the autocorrelation of the cubed signal. It is available with the
Speech Filing System at (http://www.phon.ucl.ac.uk/resource/sfs). The name of the
function is fxac.

* ANAL: This algorithm (Secrest and Doddington, 1983) uses autocorrelation to estimate
the pitch, and dynamic programming to remove discontinuities in the pitch trace. It is
available with the Speech Filing System at (http://www.phon.ucl.ac.uk/resource/sfs). The
name of the function is fxanal.

* CATE: This algorithm uses a quasi autocorrelation function of the speech excitation signal
to estimate the pitch. We implemented it based on its original description (Di Martino,
1999). The dynamic programming component used to remove discontinuities in the pitch
trace was not implemented.

* CC: This algorithm uses cross-correlation to estimate the pitch and post-processing to
remove discontinuities in the pitch trace. It is available with the Praat System at
(http://www.fon.hum.uva.nl/praat). The name of the function is cc.

* CEP: This algorithm (Noll, 1967) uses the cepstrum of the signal and is available with the
Speech Filing System at (http://www.phon.ucl.ac.uk/resource/sfs). The name of the
function is fxcep.

* ESRPD: This algorithm (Bagshaw, 1993; Medan, 1991) uses a normalized
cross-correlation to estimate the pitch, and post-processing to remove discontinuities
in the pitch trace. It is available with the Festival Speech Filing System at
(http://www.cstr.ed.ac.uk/projects/festival). The name of the function is pda.










* RAPT: This algorithm (Secrest and Doddington, 1983) uses a normalized cross-
correlation to estimate the pitch, and dynamic programming to remove discontinuities
in the pitch trace. It is available with the Speech Filing System at
(http://www.phon.ucl.ac.uk/resource/sfs). The name of the function is fxrapt.

* SHS: This algorithm (Hermes, 1988) uses subharmonic summation. It is available with the
Praat System at (http://www.fon.hum.uva.nl/praat). The name of the function is shs.

* SHR: This algorithm (Sun, 2000) uses the subharmonic-to-harmonic ratio. It is available at
Matlab Central (http://www.mathworks.com/matlabcentral) under the title "Pitch
Determination Algorithm". The name of the function is shrp.

* TEMPO: This algorithm (Kawahara et al., 1999) uses the instantaneous frequency of the
outputs of a filterbank. It is available with the STRAIGHT System at its author's web page
(http://www.wakayama-u.ac.jp/~kawahara). The name of the function is exstraightsource.

* YIN: This algorithm (de Cheveigne and Kawahara, 2002) uses a modified version of the
average squared difference function. It is available from its author's web page at
(http://www.ircam.fr/pcm/cheveign/sw/yin.zip). The name of the function is yin.

4.2 Databases

The databases used to test the algorithms were the following:

* DVD: Disordered Voice Database. This database contains 657 samples of sustained
vowels produced by persons with disordered voice. It can be bought from Kay Pentax
(http://www.kayelemetrics.com).

* KPD: Keele Pitch Database. This speech database was collected by Plante et al. (1995) at
Keele University with the purpose of evaluating pitch estimation algorithms. It contains
about 8 minutes of speech spoken by five males and five females. Laryngograph data was
recorded simultaneously with speech, and was used to produce estimates of the
fundamental frequency. It is publicly available at (ftp://ftp.cs.keele.ac.uk/pub/pitch).

* MIS: Musical Instruments Samples. This database contains more than 150 minutes of
sound produced by 20 different musical instruments. It was collected at the University of
Iowa Electronic Music Studios, directed by Lawrence Fritts, and is publicly available at
(http://theremin.music.uiowa.edu).

* PBD: Paul Bagshaw's Database for evaluating pitch determination algorithms. This
database contains about 8 minutes of speech spoken by one male and one female.
Laryngograph data was recorded simultaneously with speech, and was used to produce
estimates of the fundamental frequency. It was collected by Paul Bagshaw at the
University of Edinburgh (Bagshaw et al., 1993; Bagshaw, 1994), and is publicly available at
(http://www.cstr.ed.ac.uk/research/projects/fda).









4.3 Methodology

The algorithms were asked to produce a pitch estimate every millisecond. The search range

was set to 40-800 Hz for speech and 30-1666 Hz for musical instruments. The algorithms were

given the freedom to decide if the sound was pitched or not. However, to compute our statistics,

we considered only the time instants at which all the algorithms agreed that the sound was

pitched.

Special care was taken to account for time misalignments. Specifically, the pitch estimates

were associated to the time corresponding to the center of their respective analysis windows, and

when the ground truth pitch varied over time (i.e., for PBD and KPD), the estimated pitch time

series were shifted within a range of 100 ms to find the best alignment with the ground truth.

The performance measure used to compare the algorithms was the gross error rate (GER).

A gross error occurs when the estimated pitch is off from the reference pitch by more than 20%.

At first glance this margin of error seems too large, but considering that most of the errors pitch

estimation algorithms produce are octave errors (i.e., halving or doubling the pitch), this is a

reasonable metric. On the other hand, this tolerance gives room for dealing with misalignments.

The GER measure has been used previously to test PEAs by other researchers (Bagshaw, 1993;

Di Martino, 1999; de Cheveigne and Kawahara, 2002).
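The gross error rate described above can be computed as in the following sketch. The function name, argument names, and example values are hypothetical; the optional mask stands for the time instants at which all the algorithms agreed that the sound was pitched.

```python
import numpy as np

def gross_error_rate(estimated_hz, reference_hz, pitched_mask=None, tolerance=0.20):
    """Percentage of frames whose estimate deviates from the reference by more than 20%."""
    est = np.asarray(estimated_hz, dtype=float)
    ref = np.asarray(reference_hz, dtype=float)
    if pitched_mask is None:
        pitched_mask = np.ones(est.shape, dtype=bool)
    mask = np.asarray(pitched_mask, dtype=bool)
    gross = np.abs(est[mask] - ref[mask]) > tolerance * ref[mask]
    return 100.0 * gross.mean()

reference = [100.0, 100.0, 100.0, 100.0]
estimate = [100.0, 200.0, 119.0, 50.0]     # exact, octave too high, 19% off, octave too low
print(gross_error_rate(estimate, reference))                              # both octave errors count
print(gross_error_rate(estimate, reference, [True, True, True, False]))   # last frame excluded
```

An estimate that is 19% off is not counted, whereas a doubling or a halving of the pitch is, which is the behavior the 20% margin is meant to capture.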

4.4 Results

Table 4-1 shows the GERs for each of the algorithms over each of the speech databases.

Both the rows and the columns are sorted by average GER: the best algorithms are at the top, and

the more difficult databases are at the right. The best algorithm overall is SWIPE', followed by

SHS and SWIPE. Although on average SHS performs better than SWIPE, the only database in

which SHS beats SWIPE is the disordered voice database, which indicates that SWIPE

performs better than SHS on normal speech.





Table 4-1. Gross error rates for speech*

                     Gross error (%)
Algorithm       PBD       KPD       DVD   Average
SWIPE'         0.13      0.83      0.63      0.53
SHS            0.15      1.00      1.10      0.75
SWIPE          0.15      0.87      1.70      0.91
RAPT           0.75      1.00      2.40      1.40
TEMPO          0.32      1.90      2.00      1.40
YIN            0.33      1.40      4.50      2.10
SHR            0.69      1.50      5.10      2.40
ESRPD          1.40      3.90      5.00      3.50
CEP            6.10      4.20      4.60      5.00
AC-P           0.73      2.90     14.00      5.90
CATE           2.60     10.00      7.20      6.60
CC             0.48      3.60     16.00      6.70
ANAL           0.83      2.00     35.00     13.00
AC-S           8.80      7.00     40.00     19.00
Average        1.70      3.00      9.90      4.90
* Values computed using two significant digits.


Table 4-2. Proportion of overestimation errors relative to total gross errors*

                Proportion of overestimations
Algorithm       DVD       PBD       KPD   Average
CC              0.0
SHS             0.0
RAPT            0.0
SHR             0.0
AC              0.0
AC              0.0
ANAL            0.0
CEP             0.4
SWIPE'          0.0
SWIPE           0.1
YIN             0.1
TEMPO           0.1
CATE            0.5
ESRPD           0.5
Average         0.1
* Values computed using one significant digit.


Table 4-2 shows the proportion of GEs caused by overestimations of the pitch with respect

to the total number of GEs. The proportion of GEs caused by underestimation of the pitch is just



































One minus the values shown in the table. Algorithms at the top have a tendency to underestimate

the pitch while algorithms at the bottom have a tendency to overestimate it. Most algorithms tend

to underestimate the pitch in the disordered voice database while the errors are more balanced in

the normal speech databases.

Table 4-3 shows the pitch estimation performance as a function of gender for the two

databases for which we had access to this information: PBD and KPD. The error rates are on

average larger for female speech than for male speech.

Table 4-4 shows the GERs for the musical instruments database. Some of the algorithms

were not evaluated on this database because they did not provide a mechanism to set the search

range, and the range they covered was smaller than the pitch range spanned by the database. The

two algorithms that performed the best were SWIPE' and SWIPE.


Table 4-3. Gross error rates by gender*

                  Gross error (%)
Algorithm      Male    Female   Average
SWIPE'         0.36      2.40
SHS            0.55      2.50
SWIPE          0.49      2.70
RAPT           0.42      2.90
TEMPO          0.67      3.10
SHR            0.61      3.60
YIN            1.10      3.20
AC-P           2.10      3.60
CEP            1.80      4.20
CC             2.40      4.50
ESRPD          3.10      3.90
ANAL           1.30      5.90
AC-S           3.20     10.00
CATE          11.00      4.20
Average        2.10      4.00
* Values computed using two significant digits.










Table 4-4. Gross error rates for musical instruments*

                          Gross error (%)
Algorithm    Underestimates   Overestimates     Total
SWIPE'                 1.00            0.10      1.10
SWIPE                  1.30            0.02      1.30
SHS                    0.88            1.00      1.90
TEMPO                  0.29            1.70      2.00
YIN                    1.60            0.83      2.40
AC-P                   3.20            0.00      3.20
CC                     3.60            0.00      3.60
ESRPD                  5.30            1.50      6.80
SHR                   15.00            5.30     20.00
Average                3.60            1.20      4.70
* Values computed using two significant digits.


Table 4-5. Gross error rates by instrument family*

                                 Gross error (%)
                         Bowed                          Plucked
Algorithm     Brass    Strings   Woodwinds     Piano    Strings   Average
SWIPE'         0.01       0.19        0.14      2.20       8.80      2.30
SWIPE          0.00       0.22        0.23      0.02      11.00      2.30
TEMPO          0.00       2.60        1.40      7.30       4.00      3.10
YIN            0.03       1.10        1.50      0.36      14.00      3.40
SHS            0.02       1.50        0.72     12.00       8.10      4.50
AC-P           0.03       0.56        0.80      0.36      26.00      5.60
CC             0.07       0.83        1.00      0.36      28.00      6.00
ESRPD          4.00       6.90        7.10      6.00      11.00      7.00
SHR           22.00      25.00       38.00     26.00      15.00     25.00
Average        2.90       4.30        5.60      6.10      14.00      6.60

* Values computed using two significant digits. Brass: French horn, bass/tenor trombones, trumpet, and tuba.
Bowed strings: double bass, cello, viola, and violin. Woodwinds: flute, bass/alto flutes, bass/Bb/Eb clarinets,
alto/soprano saxes. Plucked strings: cello and violin.


Table 4-5 shows the GERs by instrument family. The two best algorithms are SWIPE' and

SWIPE. SWIPE' tends to perform better than SWIPE except for the piano, for which SWIPE

produces almost no error. On the other hand, SWIPE' performance on piano is relatively poor

compared to correlation-based algorithms. The family for which the fewest errors were obtained was

the brass family; many algorithms achieved almost perfect performance for this family. The










Table 4-6. Gross error rates for musical instruments by octave*
Gross error (%), by octave (center frequency +/- 1/2 octave)
Algorithm  46.2 Hz  92.5 Hz  185 Hz  370 Hz  740 Hz  1480 Hz  Average
SWIPE'     1.20     1.00     2.30    0.89    0.13    0.29     0.97
SWIPE      0.08     1.20     3.00    1.00    0.25    0.38     0.99
YIN        3.20     0.95     5.30    1.80    0.69    0.96     2.20
AC-P       0.24     2.00     7.80    2.50    0.71    0.30     2.30
SHS        7.80     2.60     3.20    1.20    0.23    0.14     2.50
CC         0.26     2.60     8.20    2.70    0.93    0.40     2.50
TEMPO      15.00    2.80     2.00    1.10    0.52    0.31     3.60
ESRPD      7.90     2.60     4.80    4.20    12.00   32.00    11.00
SHR        37.00    0.60     1.80    27.00   70.00   81.00    36.00
Average    8.10     1.80     4.30    4.70    9.50    13.00    6.90
* Values computed using two significant digits.


family for which the most errors were produced was the strings family playing pizzicato, i.e., by

plucking the strings. Indeed, pizzicato sounds were the ones for which the performers produced

more errors and the ones that were hardest for us to label (see Appendix B).

Table 4-6 shows the GERs as a function of octave. The best performance on average was

achieved by SWIPE' and SWIPE. The results of the algorithms with an average GER less than





[Figure omitted: gross error rate versus pitch (Hz), from 46.2 Hz to 1480 Hz.]

Figure 4-1. Gross error rates for musical instruments as a function of pitch.










Table 4-7. Gross error rates for musical instruments by dynamic*
Gross error (%)
Algorithm pp mf ff Average
SWIPE' 1.30 1.20 0.92 1.10
SWIPE 1.40 1.40 1.20 1.30
SHS 1.50 2.30 2.00 1.90
TEMPO 2.00 1.90 2.00 2.00
YIN 2.20 2.50 2.40 2.40
AC-P 3.30 3.20 3.30 3.30
CC 3.60 3.30 3.80 3.60
ESRPD 5.70 7.10 7.60 6.80
SHR 27.00 29.00 29.00 28.00
Average 5.30 5.80 5.80 5.60
* Values computed using two significant digits.


10% is reproduced in Figure 4-1. All algorithms have approximately the same tendency, except

at the lowest octave, where a larger variance in the GERs can be observed.

Table 4-7 shows the GERs as a function of dynamic (i.e., loudness). In general, there is no

significant variation of GERs with changes in loudness, although SWIPE' has a tendency to

reduce the GER as loudness increases [i.e., as the dynamic moves from pianissimo (pp) to

fortissimo (ff) ].

As a final test, we wanted to validate the choices we made in Chapter 3, i.e., shape of the

kernel, warping of the spectrum, weighting of the harmonics, warping of the frequency scale, and

selection of window type and size. For this purpose, we evaluated SWIPE' while replacing one

of its features at a time with a more standard one, i.e., smooth vs. pulsed kernels, square-root vs.

raw spectrum, decaying vs. flat kernel envelope, ERB vs. Hertz frequency scale, and pitch-

optimized vs. fixed window size. We varied each of these variables independently and obtained

the results shown in Table 4-8. Although some of the variations made SWIPE' improve in some

of the databases, overall SWIPE' worked better with the features we proposed in Chapter 3.










Table 4-8. Gross error rates for variations of SWIPE'*
Gross error (%)
Variation PBD KPD DVD MIS Average
Original 0.13 0.83 0.63 1.10 0.67
Flat envelope 0.16 1.00 1.40 0.60 0.79
Hertz scale (1) 0.23 1.70 1.40 0.37 0.93
Pulsed kernel 0.21 0.84 3.00 2.60 1.70
Raw spectrum (2) 0.25 2.10 1.60 4.90 2.20
Fixed WS (3) 0.15 0.77 1.70 9.10 2.90
* Values computed using two significant digits.
(1) FFTs were computed using optimal window sizes and the spectrum was inter/extrapolated to frequency bins
separated at 5 Hz.
(2) The use of the raw spectrum rather than the square root of the spectrum implies the use of a kernel whose
envelope decays as $1/f$ rather than $1/\sqrt{f}$ to match the spectral envelope of a sawtooth waveform.
(3) The power-of-two window size whose optimal pitch was closest to the geometric mean pitch of the database
was used in each case. A window of size 1024 samples was used for the speech databases and a window of size
256 samples was used for the musical instruments database.


4.5 Discussion

SWIPE' showed the best performance in all categories. SWIPE was the second best ranked

for musical instruments and normal speech but not for disordered speech, for which SHS

performed better (see Table 4-1). One possible reason is that it is common for disordered voices

to have energy at multiples of subharmonics of the pitch, and therefore algorithms that apply

negative weights to the spectral regions between harmonics (e.g., SWIPE, SWIPE', and all

autocorrelation based algorithms) are prone to produce low scores for the pitch. Although

SWIPE' is among this group, its use of only the first and prime harmonics substantially reduces

the scores of the subharmonics of the pitch, producing most of the time a larger score for the pitch than

for its subharmonics.

The rankings of the algorithms are relatively stable in all the tables except for SHR, which

showed a good performance for speech but not for musical instruments. We believe this is

caused by the wide pitch range spanned by the musical instruments. This is suggested by the

results in Table 4-6, which show that SHR performs well in the octaves around 92.5 Hz and 185










Hz, which correspond to the pitch region of speech, but performs very poorly as the pitch moves

away from this region.

Figure 4-1 shows that the relative trend on performance with pitch for musical instruments

is about the same for all the algorithms except in the lowest region, where a large variance in

performance was observed. However, this variance may be caused by a significant reduction in

the number of samples in this region (about 4% of the data). The figure also shows an overall

increase in GER in the octave around 185 Hz. We believe this is caused by the presence of a set

of difficult sounds in the database with pitches in that region, since it is hard to believe that there

is an inherent difficulty of the algorithms to recognize pitch in that region.









CHAPTER 5
CONCLUSION

The SWIPE pitch estimator has been developed. SWIPE estimates the pitch as the

fundamental frequency of the sawtooth waveform whose spectrum best matches the spectrum of

the input signal. The schematic description of the algorithm is the following:

1. For each pitch candidate within a pitch range fmin-fmax, compute its pitch strength as follows:

a. Compute the square-root of the spectrum of the signal.

b. Normalize the square-root of the spectrum and apply an integral transform using a
normalized cosine kernel whose envelope decays as $1/\sqrt{f}$.

2. Estimate the pitch as the candidate with highest strength.

An implicit objective of the algorithm was to find the frequency for which the average

peak-to-valley distance at its harmonics is maximized. To achieve this, the kernel was set to zero

below the first negative lobe and above the last negative lobe, and to avoid bias, the magnitude

of these two lobes was halved.

To make the contribution of each harmonic of the sawtooth waveform proportional to its

amplitude and not to the square of its amplitude, the square-root of the spectrum was taken

before applying the integral transform.

To make the kernel match the normalized square-root spectrum of the sawtooth waveform,

a $1/\sqrt{f}$ envelope was applied to the kernel. The kernel was normalized using only its positive

part.
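The kernel construction described in the preceding paragraphs can be sketched in Python as follows. This is only an illustration that mirrors the Matlab function pitchStrengthOneCandidate of Appendix A; the name `swipe_kernel` and its argument layout are assumptions made for this example, and the frequencies are assumed to be strictly positive.

```python
import math

def swipe_kernel(freqs, pc, harmonics):
    """Decaying cosine kernel for pitch candidate pc (Hz), sampled at freqs (Hz).

    Hypothetical helper; harmonics is e.g. [1, 2, ..., n] for SWIPE.
    """
    k = [0.0] * len(freqs)
    for idx, f in enumerate(freqs):
        q = f / pc                          # frequency normalized by the candidate
        for h in harmonics:
            a = abs(q - h)
            if a < 0.25:                    # peak centered at harmonic h
                k[idx] = math.cos(2 * math.pi * q)
            elif 0.25 < a < 0.75:           # valley next to harmonic h, added at half weight
                k[idx] += math.cos(2 * math.pi * q) / 2
        k[idx] /= math.sqrt(f)              # 1/sqrt(f) envelope
    # Normalize using only the positive part of the kernel.
    norm = math.sqrt(sum(v * v for v in k if v > 0))
    return [v / norm for v in k]

freqs = [float(f) for f in range(1, 2001)]  # 1 Hz grid, for illustration only
kernel = swipe_kernel(freqs, 100.0, [1, 2, 3])
```

A valley shared by two adjacent harmonics receives two half contributions and therefore full weight, whereas the outermost valleys receive only one, which is how the halving of the first and last negative lobes arises. Outside those lobes the kernel stays at zero.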

To maximize the similarity between the kernel and the square-root of the input spectrum,

each pitch candidate required its own window size, which in general is not a power of two, and

therefore not ideal to compute an FFT. To reduce computational cost, the two closest power-of-two

window sizes were used, and their results were combined to produce a single pitch strength

value. This had the extra advantage of allowing an FFT to be shared by many pitch candidates.









Another technique used to reduce computational cost was to compute a coarse pitch strength

curve and then fine tune it by using parabolic interpolation. A last technique used to reduce

computational cost was to reduce the amount of window overlap while allowing the pitch of a

signal as short as four cycles to be recognized.
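As an illustration of how the two closest power-of-two window sizes can be combined, the following Python sketch computes the weight given to each of them. It assumes the ideal window size 4*K*fs/p samples with K = 2 and the linear weighting in the base-2 logarithmic domain used in the Matlab code of Appendix A; the function name `window_weights` is hypothetical.

```python
import math

def window_weights(pc, fs, k=2):
    """Return {power-of-two window size: weight} for pitch candidate pc (Hz)."""
    ideal = 4 * k * fs / pc          # ideal window size in samples (expression from Appendix A)
    e = math.log2(ideal)
    lo, hi = math.floor(e), math.ceil(e)
    if lo == hi:                     # the ideal size is already a power of two
        return {2 ** lo: 1.0}
    # Each weight is one minus the log2 distance to the ideal size, so the two weights sum to one.
    return {2 ** hi: e - lo, 2 ** lo: hi - e}

weights = window_weights(100.0, 10000.0)   # ideal size is 800 samples
```

For a 100 Hz candidate at a 10 kHz sampling rate the ideal size of 800 samples lies between 512 and 1024, and the larger window receives the larger weight because it is closer in the logarithmic domain.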

The ERB frequency scale was used to compute the spectral integral transform since the

density of this scale decreases almost proportionally to frequency, which avoids wasting

computation in regions where little spectral energy is expected.
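The effect of the ERB scale on the sampling density can be checked with a short Python sketch. The conversion constants 21.4 and 229 are the ones used in the Matlab functions hz2erbs and erbs2hz of Appendix A; the helper `erb_spaced_frequencies` and the 10-5000 Hz range are assumptions made for this example.

```python
import math

def hz2erbs(hz):
    return 21.4 * math.log10(1 + hz / 229)

def erbs2hz(erbs):
    return (10 ** (erbs / 21.4) - 1) * 229

def erb_spaced_frequencies(f_lo, f_hi, d_erbs=0.1):
    """Frequencies (Hz) starting at f_lo and not exceeding f_hi, spaced d_erbs ERBs apart."""
    n = int((hz2erbs(f_hi) - hz2erbs(f_lo)) / d_erbs) + 1
    return [erbs2hz(hz2erbs(f_lo) + i * d_erbs) for i in range(n)]

freqs = erb_spaced_frequencies(10.0, 5000.0)
print(freqs[1] - freqs[0], freqs[-1] - freqs[-2])  # spacing in Hz at the low and high ends
```

The spacing in Hertz between consecutive samples grows with frequency, so fewer samples are spent in the high-frequency region.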

SWIPE', a variation of SWIPE, uses only the first and prime harmonics of the signal,

producing a large reduction in subharmonic errors by reducing significantly the scores of

subharmonics of the pitch.
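A simplified way to see why the first and prime harmonics penalize subharmonics is to count how many components of a harmonic signal coincide with a kernel peak of the candidate f0/m. The Python sketch below does only this counting; it ignores the kernel weights and valleys, and the helper names are hypothetical.

```python
def primes_up_to(n):
    """Primes <= n computed with a simple sieve."""
    sieve = [True] * (n + 1)
    sieve[:2] = [False, False]
    for i in range(2, int(n ** 0.5) + 1):
        if sieve[i]:
            for j in range(i * i, n + 1, i):
                sieve[j] = False
    return [i for i, is_prime in enumerate(sieve) if is_prime]

def matched_components(m, n, harmonics):
    """Count signal components that fall on a kernel peak of the candidate f0/m.

    In units of the candidate, the signal components lie at m, 2m, 3m, ... up to n.
    """
    used = set(harmonics)
    return sum(1 for h in range(m, n + 1, m) if h in used)

n = 24
all_harmonics = list(range(1, n + 1))          # SWIPE
prime_harmonics = [1] + primes_up_to(n)        # SWIPE'
for m in (1, 2, 3, 4):
    print(m, matched_components(m, n, all_harmonics),
          matched_components(m, n, prime_harmonics))
```

With all the harmonics, the candidate f0/2 still matches every component of the signal, whereas with the first and prime harmonics it matches only one, because every other multiple of two is composite.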

Except for the obvious architectural decisions that must be taken when creating an

algorithm (e.g., selection of the kernel), there are no free parameters in SWIPE and SWIPE', at

least in terms of "magic numbers".

SWIPE and SWIPE' were tested using speech and musical instruments databases and their

performance was compared against twelve other algorithms which have been cited in the

literature and for which free implementations exist. SWIPE' was shown to outperform all the

algorithms on all the databases. SWIPE was ranked second in the normal speech and musical

instruments databases, and was ranked third in the disordered speech database.











APPENDIX A
MATLAB IMPLEMENTATION OF SWIPE'


This is a Matlab implementation of SWIPE'. To convert it into SWIPE just replace

[ 1 primes(n) ] in the for loop of the function pitchStrengthOneCandidate with [ 1:n ].

function [p,t,s] = swipep(x,fs,plim,dt,dlog2p,dERBs,sTHR)
% SWIPEP Pitch estimation using SWIPE'.
%    P = SWIPEP(X,Fs,[PMIN PMAX],DT,DLOG2P,DERBS,STHR) estimates the pitch of
%    the vector signal X with sampling frequency Fs (in Hertz) every DT
%    seconds. The pitch is estimated by sampling the spectrum in the ERB scale
%    using a step of size DERBS ERBs. The pitch is searched within the range
%    [PMIN PMAX] (in Hertz) sampled every DLOG2P units in a base-2 logarithmic
%    scale of Hertz. The pitch is fine tuned by using parabolic interpolation
%    with a resolution of 1/64 of semitone (approx. 1.6 cents). Pitches with a
%    strength lower than STHR are treated as undefined.
%
%    [P,T,S] = SWIPEP(X,Fs,[PMIN PMAX],DT,DLOG2P,DERBS,STHR) returns the times
%    T at which the pitch was estimated and their corresponding pitch strength.
%
%    P = SWIPEP(X,Fs) estimates the pitch using the default settings PMIN =
%    30 Hz, PMAX = 5000 Hz, DT = 0.01 s, DLOG2P = 1/96 (96 steps per octave),
%    DERBS = 0.1 ERBs, and STHR = -Inf.
%
%    P = SWIPEP(X,Fs,...[],...) uses the default setting for the parameter
%    replaced with the placeholder [].
%
%    EXAMPLE: Estimate the pitch of the signal X every 10 ms within the
%    range 75-500 Hz using the default resolution (i.e., 96 steps per
%    octave), sampling the spectrum every 1/20th of ERB, and discarding
%    samples with pitch strength lower than 0.4. Plot the pitch trace.
%       [x,Fs] = wavread(filename);
%       [p,t,s] = swipep(x,Fs,[75 500],0.01,[],1/20,0.4);
%       plot(1000*t,p)
%       xlabel('Time (ms)')
%       ylabel('Pitch (Hz)')
if ~ exist( 'plim' ) | isempty(plim), plim = [30 5000]; end
if ~ exist( 'dt' ) | isempty(dt), dt = 0.01; end
if ~ exist( 'dlog2p' ) | isempty(dlog2p), dlog2p = 1/96; end
if ~ exist( 'dERBs' ) | isempty(dERBs), dERBs = 0.1; end
if ~ exist( 'sTHR' ) | isempty(sTHR), sTHR = -Inf; end
t = [ 0: dt: length(x)/fs ]'; % Times
dc = 4; % Hop size (in cycles)
K = 2; % Parameter k for Hann window
% Define pitch candidates
log2pc = [ log2(plim(1)): dlog2p: log2(plim(end)) ]';
pc = 2 .^ log2pc;
S = zeros( length(pc), length(t) ); % Pitch strength matrix
% Determine P2-WSs
logWs = round( log2( 4*K * fs ./ plim ) );
ws = 2.^[ logWs(1): -1: logWs(2) ]; % P2-WSs
pO = 4*K * fs ./ ws; % Optimal pitches for P2-WSs
% Determine window sizes used by each pitch candidate
d = 1 + log2pc - log2( 4*K*fs./ws(1) );











% Create ERBs spaced frequencies (in Hertz)
fERBs = erbs2hz([ hz2erbs(pc(1)/4): dERBs: hz2erbs(fs/2) ]');
for i = 1 : length(ws)
    dn = round( dc * fs / pO(i) ); % Hop size (in samples)
    % Zero pad signal
    xzp = [ zeros( ws(i)/2, 1 ); x(:); zeros( dn + ws(i)/2, 1 ) ];
    % Compute spectrum
    w = hanning( ws(i) ); % Hann window
    o = max( 0, round( ws(i) - dn ) ); % Window overlap
    [ X, f, ti ] = specgram( xzp, ws(i), fs, w, o );
    % Interpolate at equidistant ERBs steps
    M = max( 0, interp1( f, abs(X), fERBs, 'spline', 0) ); % Magnitude
    L = sqrt( M ); % Loudness
    % Select candidates that use this window size
    if i==length(ws); j=find(d-i>-1); k=find(d(j)-i<0);
    elseif i==1; j=find(d-i<1); k=find(d(j)-i>0);
    else j=find(abs(d-i)<1); k=1:length(j);
    end
    Si = pitchStrengthAllCandidates( fERBs, L, pc(j) );
    % Interpolate at desired times
    if size(Si,2) > 1
        Si = interp1( ti, Si', t, 'linear', NaN )';
    else
        Si = repmat( NaN, length(Si), length(t) );
    end
    lambda = d( j(k) ) - i;
    mu = ones( size(j) );
    mu(k) = 1 - abs( lambda );
    S(j,:) = S(j,:) + repmat(mu,1,size(Si,2)) .* Si;
end
% Fine-tune the pitch using parabolic interpolation
p = repmat( NaN, size(S,2), 1 );
s = repmat( NaN, size(S,2), 1 );
for j = 1 : size(S,2)
    [ s(j), i ] = max( S(:,j) );
    if s(j) < sTHR, continue, end
    % A maximum at either edge has no neighbors to interpolate with
    if i==1, p(j)=pc(1); elseif i==length(pc), p(j)=pc(end); else
        I = i-1 : i+1;
        tc = 1 ./ pc(I);
        ntc = ( tc/tc(2) - 1 ) * 2*pi;
        c = polyfit( ntc, S(I,j), 2 );
        ftc = 1 ./ 2.^[ log2(pc(I(1))): 1/12/64: log2(pc(I(3))) ];
        nftc = ( ftc/tc(2) - 1 ) * 2*pi;
        [s(j) k] = max( polyval( c, nftc ) );
        p(j) = 2 ^ ( log2(pc(I(1))) + (k-1)/12/64 );
    end
end

function S = pitchStrengthAllCandidates( f, L, pc )
% Normalize loudness
warning off MATLAB:divideByZero
L = L ./ repmat( sqrt( sum(L.*L) ), size(L,1), 1 );
warning on MATLAB:divideByZero
% Create pitch salience matrix
S = zeros( length(pc), size(L,2) );
for j = 1 : length(pc)
    S(j,:) = pitchStrengthOneCandidate( f, L, pc(j) );
end


function S = pitchStrengthOneCandidate( f, L, pc )
n = fix( f(end)/pc - 0.75 ); % Number of harmonics
k = zeros( size(f) ); % Kernel
q = f / pc; % Normalize frequency w.r.t. candidate
for i = [ 1 primes(n) ]
    a = abs( q - i );
    % Peak's weight
    p = a < .25;
    k(p) = cos( 2*pi * q(p) );
    % Valleys' weights
    v = .25 < a & a < .75;
    k(v) = k(v) + cos( 2*pi * q(v) ) / 2;
end
% Apply envelope
k = k .* sqrt( 1./f );
% K+-normalize kernel
k = k / norm( k(k>0) );
% Compute pitch strength
S = k' * L;

function erbs = hz2erbs(hz)
erbs = 21.4 * log10( 1 + hz/229 );

function hz = erbs2hz(erbs)
hz = ( 10 .^ (erbs./21.4) - 1 ) * 229;










APPENDIX B
DETAILS OF THE EVALUATION

B.1 Databases

All the databases used in this work are free and publicly available on the Internet, except

the disordered voice database. Besides speech recordings, the speech databases contain

simultaneous recordings of laryngograph data, which facilitates the computation of the

fundamental frequency. The authors of these databases used them to produce ground truth pitch

values, which are also included in the databases. The disordered voice database includes

fundamental frequency estimates, but as will be explained later, a different ground truth data

set was used. The musical instruments database contains the names of the notes in the names of

the files.

B.1.1 Paul Bagshaw's Database

Paul Bagshaw's database (PBD) for evaluation of pitch determination algorithms

(Bagshaw et al., 1993; Bagshaw, 1994) was collected at the University of Edinburgh, and is

available at (http://www.cstr.ed.ac.uk/research/projects/fda). The speech and laryngograph

signals of this database were sampled at 20 kHz. The ground truth fundamental frequency was

computed by estimating the location of the glottal pulses in the laryngograph data and taking the

inverse of the distance between each pair of consecutive pulses. Each fundamental frequency

estimate is associated to the time instant in the middle between the pair of pulses used to derive

the estimate.

B.1.2 Keele Pitch Database

The Keele Pitch Database (KPD) (Plante et al., 1995) was created at Keele University and

is available at (ftp://ftp.cs.keele.ac.uk/pub/pitch). The speech and laryngograph signals were

sampled at 20 kHz, and the ground truth pitch values were computed using a

26.5 ms window shifted at intervals of 10 ms. Windows where the pitch is unclear are marked

with special codes.

Both of these speech databases PBD and KPD have been reported to contain errors (de

Cheveigne, 2002), especially at the end of sentences, where the energy of speech decays and

malformed pulses may occur. We will explain later how we deal with this problem.

B.1.3 Disordered Voice Database

The disordered voice database (DVD) was collected by Kay Pentax

(http://www.kayelemetrics.com). It includes 657 disordered voice samples of the sustained

vowel "ah" sampled at 25 kHz [...] patients with a wide variety of organic, neurological, traumatic,

psychogenic, and other voice disorders.

The database includes fundamental frequency estimates, but by definition, they do not

necessarily match their pitch. Therefore we estimated the pitch ourselves by listening to the

samples through earphones, and matching the pitch to the closest note, using as reference a

synthesizer playing sawtooth waveforms. Assuming that we chose one of the two closest notes

every time, this procedure should introduce an error no larger than 6%, which is smaller than the

20% necessary to produce a GE (see Chapter 4).

There were some samples for which the pitch ranged over a perfect fourth or more (i.e., the

higher pitch was more than 33% higher than the lower pitch). Since this range is large compared

to the permissible 20%, these samples were excluded. Samples for which the range did not span

more than a major third (i.e., the higher pitch was no more than 26% higher than the lower pitch)

were preserved, and they were assigned the note corresponding to the median of the range. If the

median was between two notes, it was assigned to either of them. This should introduce an error no










larger than two semitones (12%), which is about half the maximum permissible error of 20%.

There were 30 samples for which we could not perceive with confidence a pitch, so they were

excluded as well.
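The percentages quoted above follow from the equal-tempered scale, in which an interval of n semitones corresponds to a frequency ratio of 2^(n/12). A short Python check, given only as an illustration (the function name is an assumption of this example):

```python
def interval_percent(semitones):
    """Relative pitch difference (%) of an equal-tempered interval."""
    return (2 ** (semitones / 12) - 1) * 100

for name, st in [("semitone", 1), ("two semitones", 2),
                 ("major third", 4), ("perfect fourth", 5)]:
    print(f"{name}: {interval_percent(st):.1f}%")
# semitone: 5.9%
# two semitones: 12.2%
# major third: 26.0%
# perfect fourth: 33.5%
```

All of these except the perfect fourth are below or close to the 20% threshold that defines a gross error, which is why samples spanning a perfect fourth or more were excluded.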

Since the ground truth data was based on the perception of only one listener (the author), it

could be argued that this data has low validity. To alleviate this, we excluded the samples for

which the minimum error produced by any algorithm was larger than 50%.

After excluding the non-pitch, variable pitch, and samples at which the algorithms

disagreed with the ground truth, we ended up with 612 samples out of the original 657.

Appendix C shows the ground truth used for each of these 612 samples.

B.1.4 Musical Instruments Database

The musical instruments samples database was collected at the University of Iowa, and is

available at (http://theremin.music.uiowa.edu). The recordings were made using CD quality

sampling at a rate of 44,100 Hz, but were downsampled to 10 kHz to reduce

computational cost. No noticeable change of perceptual pitch was perceived by doing this, even

for the highest pitch sounds. This database contains recordings of 20 instruments, for a total of

more than 150 minutes and 4,000 notes. The notes are played in sequence using a chromatic

scale with silences in between. Each file usually spans one octave and is labeled with the name

of the initial and final notes, plus the name of the instrument, and other details (e.g.,

Violin.pizz.mf.sulG.C4B4.aiff).

In order to test the algorithms, the files were split into separate files, each containing

a single note with no leading or trailing silence. This process was done in a semi-automatic

way by using a power-based segmentation method, and then checking visually and aurally the

quality of the segmentation.









While doing this task it was discovered that some of the note labels were wrong. The

intervals produced by the performers were sometimes larger than a semitone, and therefore the

names of the files did not correspond to the notes that were in fact played. This situation was

common with string instruments, especially when playing in pizzicato.

Therefore, after splitting the files, we listened to each of them, and manually corrected the

wrong names by using as reference an electronic keyboard. This procedure sometimes

introduced name conflicts (i.e., there were repeated notes played by the same instrument, same

dynamic, etc.), and when this occurred, we removed the repeated notes trying to keep the closest

note to the target. When the conflicting notes were equally close to the target, the "best quality"

sound was preserved. This removal of files was done to avoid the overhead of having to add

extra symbols to the file names to allow for repetitions, which would have complicated the

generation of scripts to test the algorithms.

Since this process of manually correcting the names of the notes was very tedious,

especially for the pizzicato sounds, after fixing all the pizzicato bass and violin notes, the process

was abandoned and the cello and viola pizzicato sounds were excluded from our evaluation.

Arguably, except for the bass, pizzicato sounds are not very common in music, and therefore

leaving the cello and viola pizzicato sounds out did not affect the representativeness of the

sample significantly.

B.2 Evaluation Using Speech

Whenever possible, each of the algorithms was asked to give a pitch estimate every

millisecond within the range 40-800 Hz, using the default settings of the algorithm (an exception

was made for ESRPD: instead of using the default settings in the Festival implementation, the

recommendations suggested by the author of the algorithm were followed). The range 40-800

was used to make the results comparable to the results published by de Cheveigne (2002).









However, a full comparison is not possible since some other variables were treated differently in

that study.

The commands issued for each of the algorithms were the following:

* AC-P: To Pitch (ac)... 0.001 40 15 no 0.03 0.45 0.01 0.35 0.14 800
* AC-S: fxac input_file
* ANAL: fxanal input_file
* CC: To Pitch (cc)... 0.001 40 15 no 0.03 0.45 0.01 0.35 0.14 800
* CEP: fxcep input_file
* ESRPD: pda input_file -o output_file -L -d 1 -shift 0.001 -length 0.0384 -fmax 800 -fmin 40 -lpfilter 600
* RAPT: fxrapt input_file
* SHS: To Pitch (shs)... 0.001 40 15 1250 15 0.84 800 48
* SHR: [ t, p ] = shrp( x, fs, [40 800], 40, 1, 0.4, 1250, 0, 0 );
* SWIPE: [ p, t ] = swipe( x, fs, [40 800], 0.001, 1/96, 0.1, -Inf );
* SWIPE': [ p, t ] = swipep( x, fs, [40 800], 0.001, 1/96, 0.1, -Inf );
* TEMPO: f0raw = exstraightsource( x, fs );
* YIN: p.minf0 = 40; p.maxf0 = 800; p.hop = 20; p.sr = fs; r = yin( x, p );

where x is the input signal and fs is the sampling rate in Hertz.

An important issue that had to be considered was the time associated to each pitch

estimate. Since all algorithms use symmetric windows, a reasonable choice was to associate each

estimate to the time at the center of the window. For CATE, ESRPD, and SHR, the user is

allowed to determine the size of the window, so we followed the recommendation of their

authors and we set the window sizes to 51.2, 38.4, and 40 ms, respectively. YIN uses a different

window size for each pitch candidate, but the windows are always centered at the same time

instant, and the largest window size is two periods of the largest expected pitch period. For the

Praat's algorithms AC-P, CC, and SHS, through trial and error we found that they use windows

of size 3, 1, and 2 times the largest expected pitch period, respectively. For AC-S, ANAL, CEP,

RAPT, and TEMPO, the user is not allowed to set up the window size, but the algorithms output



6 The command for CATE is not reported because we used our own implementation.









the time instants associated to each pitch estimate, so we used these times hoping that they

correspond to the centers of the analysis windows used to determine the pitch.

The times associated to the pitch ground truth series are explicitly given in the PBD

database, but not in the KPD database. For KPD, each pitch value was associated to the center of

the window. Therefore, since the ground truth pitch values were computed using 26.5 msec

windows separated at a distance of 10 msec, the first pitch estimate was assigned a time of 13.25

msec, and the time associated to each successive pitch estimate added 10 msec to the time of the

previous estimate. For the DVD database, each vowel was assumed to have a constant pitch, so

the ground truth pitch time series was assumed to be constant.

The purpose of the evaluation was to compare the pitch estimates of the algorithms, but not

their ability to distinguish the existence of pitch. Therefore, we included in the evaluation only

the regions of the signal at which all algorithms and the ground truth data agreed that pitch

existed. To achieve this, we took the time instants of the ground truth values and the time

instants produced by all the algorithms that estimated the pitch every millisecond (9 out of 13

algorithms), rounded them to the closest multiple of 1 millisecond, and took the intersection.

This intersection would form the set of times at which all the algorithms would be evaluated. The

algorithms that produced pitch estimates at a rate lower than 1,000 per second were not

considered for finding the intersection because that would reduce the time granularity of our

evaluation, which was desired to be one millisecond.

As suggested in the previous paragraph, some algorithms do not necessarily produce pitch

estimates at times that are multiples of one millisecond, i.e., they may produce the estimates at

the times t + Δt ms, where t is an integer and |Δt| < 1. Thus, to evaluate them at multiples of one

millisecond, the pitch values at the desired times were inter/extrapolated in a logarithmic scale.









In other words, we took the logarithm of the estimated pitches, inter/extrapolated them to the

desired times, and took the exponential of the inter/extrapolated pitches. Inter/extrapolation in

the logarithmic domain was preferred because we believe this is the natural scale for pitch. This

is what allows us to recognize a song whether it is sung by a male or a female.
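The inter/extrapolation in the logarithmic domain described above can be sketched in Python as follows. The function name and the use of piecewise-linear interpolation between neighboring estimates are assumptions of this example; at least two estimates with increasing times are required.

```python
import math

def interp_log_pitch(times, pitches, t):
    """Pitch at time t by linear inter/extrapolation of the log-pitch."""
    # Pick the segment that brackets t, or the nearest end segment when extrapolating.
    i = 0
    while i < len(times) - 2 and t > times[i + 1]:
        i += 1
    t0, t1 = times[i], times[i + 1]
    l0, l1 = math.log(pitches[i]), math.log(pitches[i + 1])
    return math.exp(l0 + (l1 - l0) * (t - t0) / (t1 - t0))

# Halfway between 100 Hz and 200 Hz the result is their geometric mean (about 141.4 Hz).
print(interp_log_pitch([0.0, 2.0], [100.0, 200.0], 1.0))
```

Interpolating in this domain places the midpoint of an octave glide at the geometric mean of its end points rather than at the arithmetic mean of 150 Hz.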

An important issue that must be considered when using simultaneous recordings of the

laryngograph and speech signals is that the latter are typically delayed with respect to the former.

An attempt to correct this misalignment was reported by the authors of KPD, but its success was

not guaranteed. No attempt at correction was reported for PBD. Since pitch in speech is

time-varying, such misalignment could increase the estimation error significantly. To alleviate this

problem, each pitch time series produced by each algorithm was delayed or advanced, in steps of

1 msec, and up to 100 msec, in order to find the best match with the ground truth data.
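The search for the best alignment can be sketched in Python as follows. The sketch assumes that both series are sampled every millisecond and that the best match is the shift with the lowest gross error rate, using the 20% threshold mentioned in Section B.1.3; the matching criterion and the function names are assumptions of this example.

```python
def gross_error_rate(est, ref, tol=0.20):
    """Percentage of estimates that deviate more than tol from the reference."""
    errors = sum(1 for e, r in zip(est, ref) if abs(e - r) > tol * r)
    return 100.0 * errors / len(ref)

def best_shift(est, ref, max_shift=100):
    """Shift (in 1 ms steps) of est that minimizes the gross error rate."""
    best = (float("inf"), 0)
    for shift in range(-max_shift, max_shift + 1):
        # Compare est[i + shift] with ref[i] over the overlapping instants only.
        pairs = [(est[i + shift], ref[i]) for i in range(len(ref))
                 if 0 <= i + shift < len(est)]
        if not pairs:
            continue
        e, r = zip(*pairs)
        best = min(best, (gross_error_rate(e, r), shift))
    return best[1], best[0]

ref = [100.0] * 50 + [200.0] * 50
est = [100.0] * 55 + [200.0] * 45      # same contour delayed by 5 ms
print(best_shift(est, ref))
```

Because extreme shifts leave only a few overlapping instants, a practical implementation might also require a minimum overlap before accepting a shift.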

B.3 Evaluation Using Musical Instruments

Considering that many algorithms were designed for speech, the pitch range of the MIS

database is probably too large for them to handle. To alleviate this, we excluded the samples that

were outside the range 30-1666 Hz, which is nevertheless large, compared to the pitch range of

speech. Since the range 30-1666 Hz was found to be too large for the Speech Filing System

algorithms (AC-S, ANAL, CEP, and RAPT) these algorithms were not evaluated on the MIS

database. The commands issued for each of the algorithms were the following:

* AC-P: To Pitch (ac)... 0.001 30 15 no 0.03 0.45 0.01 0.35 0.14 1666
* CC: To Pitch (cc)... 0.001 30 15 no 0.03 0.45 0.01 0.35 0.14 1666
* ESRPD: pda input_file -o output_file -P -d 1 -shift 0.001 -length 0.0384 -fmax 1666 -fmin 30 -n 0 -m 0
* SHS: To Pitch (shs)... 0.001 30 15 5000 15 0.84 1666 48
* SHR: [ t, p ] = shrp( x, fs, [30 1666], 40, 1, 0.4, 5000, 0, 0 );
* SWIPE: [ p, t ] = swipe( x, fs, [30 1666], 0.001, 1/96, 0.1, -Inf );
* SWIPE': [ p, t ] = swipep( x, fs, [30 1666], 0.001, 1/96, 0.1, -Inf );
* YIN: p.minf0 = 30; p.maxf0 = 1666; p.hop = 10; p.sr = 10000; r = yin( x, p );









Besides the widening of the pitch range, the only differences with respect to the commands

used for the speech databases were for ESRPD and SHS. For both of them, the low-pass filtering

was removed in order to use as much information from the spectrum as possible. This was

convenient because the sounds were already low-pass filtered at 5 kHz, and the highest

pitch sounds (around 1666 Hz) had no more than three harmonics in the spectrum. The second

change was the use of the ESRPD peak-tracker (option -P) as an attempt to make the algorithm

improve upon its results with speech.

The evaluation process was very similar to the one followed for speech: the time instants

of the ground truth and the pitch estimates were rounded to the closest millisecond, the

intersection of all the times was taken, and the statistics were computed only at the times of this

intersection. However, there was an issue that was necessary to consider in this database. Some

instruments played much longer notes than others. The range of durations goes from tenths of a

second for strings playing in pizzicato, to several seconds for some notes of the piano. If the

overall error is computed without taking this into account, the results will be highly biased

toward the performance produced with the instruments that play the longest notes.

To account for this, the GER was computed independently for each sample, and then

averaged over all the samples. However, this introduced an undesired effect: some samples had

very few pitch estimates (only one estimate in some cases), and therefore this procedure would

give them too much weight, which potentially would introduce noise in our results. Therefore,

we discarded the samples for which the number of time instants at which the algorithms were evaluated

was less than half the duration of the sample (in milliseconds). This discarded 164 samples,

resulting in a total of 3459 samples, which was nevertheless a significant amount of data to

quantify the performance of the algorithms.
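The per-sample averaging and the discarding rule described above can be sketched in Python as follows. The data layout (one tuple of estimates, ground truth values, and duration in milliseconds per sample) and the function names are assumptions of this example; the 20% gross error threshold is the one mentioned in Section B.1.3.

```python
def gross_error_rate(est, ref, tol=0.20):
    """Percentage of estimates that deviate more than tol from the reference."""
    errors = sum(1 for e, r in zip(est, ref) if abs(e - r) > tol * r)
    return 100.0 * errors / len(ref)

def mean_ger_per_sample(samples, min_coverage=0.5):
    """Average the per-sample GER, discarding samples with too few evaluated instants.

    samples: list of (estimates, references, duration_ms) tuples, where the two
    sequences hold the pitch values at the evaluated 1 ms instants.
    Returns the mean GER and the number of discarded samples.
    """
    kept = [gross_error_rate(est, ref) for est, ref, dur in samples
            if len(ref) >= min_coverage * dur]
    return sum(kept) / len(kept), len(samples) - len(kept)

samples = [
    ([100.0] * 1000, [100.0] * 1000, 1000),                # long note, no errors
    ([100.0] * 50 + [200.0] * 50, [100.0] * 100, 100),     # short note, half wrong
    ([100.0], [100.0], 200),                               # a single estimate: discarded
]
print(mean_ger_per_sample(samples))
```

In this toy example each retained note counts equally, so the short note with many errors is not masked by the long note, as it would be if all the time instants were pooled.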











APPENDIX C
GROUND TRUTH PITCH FOR THE DISORDERED VOICE DATABASE


Table C-1. Ground truth pitch values for the disordered voice database
AAK02 220.0 AAS16 123.5 ABB09 246.9 ABG04 116.5 ACG13 207.7 ACG20 164.8
ACH16 185.0 ADM14 138.6 ADP02 155.6 ADP11 116.5 AEA03 220.0 AFR17 246.9
AHK02 110.0 AHS20 196.0 AJF12 110.0 AJM05 138.6 AJM29 123.5 AJP25 233.1
ALB18 123.5 ALW27 174.6 ALW28 220.0 AMB22 146.8 AMC14 92.5 AMC16 146.8
AMC23 196.0 AMD07 130.8 AMJ23 123.5 AMK25 77.8 AMP12 220.0 AMvT11 246.9
AMV23 185.0 ANA15 155.6 ANA20 155.6 ANB28 196.0 AOS21 110.0 ASK21 116.5
ASR20 92.5 ASR23 130.8 AWE04 155.6 AXD11 174.6 AXD19 196.0 AXL04 196.0
AXL22 196.0 AXS08 155.6 AXT11 185.0 AXT13 196.0 BAH13 98.0 BAS19 293.7
BAT19 185.0 BBR24 164.8 BCM08 233.1 BEF05 185.0 BGS05 246.9 BJH05 174.6
BJK16 174.6 BJK29 103.8 BKB13 87.3 BLB03 110.0 BMK05 246.9 BMM09 233.1
BPF03 116.5 BRT18 311.1 BSD30 130.8 BSG13 174.6 BXD17 138.6 CAC10 185.0
CAH02 196.0 CAK25 196.0 CAL12 92.5 CAL28 261.6 CAR10 196.0 CBD17 164.8
CBD19 174.6 CBD21 207.7 CBR29 174.6 CCM15 110.0 CDW03 146.8 CEN21 92.5
CER16 185.0 CER30 174.6 CFW04 155.6 CJB27 116.5 CJP10 98.0 CLE29 116.5
CLS31 185.0 CMA06 123.5 CMA22 103.8 CMR01 185.0 CMR06 110.0 CMR26 174.6
CMS10 196.0 CMS25 185.0 CNP07 196.0 CNR01 185.0 CPK19 155.6 CPK21 174.6
CPW28 220.0 CRM12 185.0 CSJ16 233.1 CSY01 110.0 CTB30 146.8 CTY03 130.8
CXL08 174.6 CXM07 130.8 CXM14 220.0 CXM18 146.8 CXP02 207.7 CXR13 146.8
CXT08 155.6 DAC26 155.6 DAG01 185.0 DAM08 174.6 DAP17 130.8 DAS10 146.8
DAS24 146.8 DAS30 87.3 DAS40 77.8 DBA02 220.0 DBFl18 155.6 DBG14 103.8
DFB09 233.1 DFS23 293.7 DFS24 293.7 DGL30 207.7 DGO03 110.0 DHD08 123.5
DJF23 146.8 DJM14 130.8 DJM28 185.0 DJP04 110.0 DLB25 261.6 DLL25 174.6
DLT09 207.7 DLW04 130.8 DMC03 185.0 DMF11 293.7 DMG07 146.8 DMG24 196.0
DMG27 155.6 DMP04 123.5 DMR27 233.1 DMS01 146.8 DOA27 92.5 DRC15 196.0
DRG19 116.5 DSC25 277.2 DSW14 138.6 DVD19 164.8 DWK04 130.8 DXS20 123.5
EAB27 164.8 EALO06 207.7 EAS11 110.0 EAS15 138.6 EAW21 207.7 EBJ03 146.8
EDG19 196.0 EEB24 164.8 EEC04 196.0 EED07 554.4 EFC08 130.8 EGK30 196.0
EGT03 138.6 EGW23 220.0 EJB01 92.5 EJM04 123.5 ELL04 116.5 EMD08 82.4
EML18 370.0 EMP27 174.6 EOW04 164.8 EPW04 164.8 EPW07 123.5 ERS07 185.0
ESL28 207.7 ESM05 138.6 ESP04 138.6 ESS05 174.6 ESS24 220.0 EWW05 174.6
EXE06 146.8 EXH21 185.0 EXI04 110.0 EXI05 116.5 EXS07 207.7 EXW12 164.8
FAH01 164.8 FGR15 130.8 FJL23 116.5 FLL27 207.7 FLW13 207.7 FMC08 196.0
FMM21 207.7 FMM29 207.7 FMQ20 155.6 FMR17 116.5 FRH18 146.8 FSP13 155.6
FXC12 110.0 FXE24 196.0 FXI23 103.8 GCU31 123.5 GEA24 130.8 GEK02 138.6
GJW09 174.6 GLB01 77.8 GLB22 98.0 GMM06 196.0 GMM07 207.7 GMS03 110.0
GMS05 261.6 GMW18 146.8 GRS20 110.0 GSB11 164.8 GSL04 116.5 GTN21 130.8
GXL21 196.0 GXT10 155.6 GXX13 164.8 HBS12 196.0 HED26 123.5 HJH07 130.8
HLC16 110.0 HLK01 116.5 HLK15 130.8 HLM24 138.6 HMG03 185.0 HML26 207.7
HWR04 164.8 HXB20 196.0 HXI29 82.4 HXL58 116.5 HXR23 116.5 IGD08 196.0
IGD16 174.6 JAB08 130.8 JAB30 164.8 JAFl15 146.8 JAJ10 207.7 JAJ22 155.6
JAJ31 155.6 JAL05 174.6 JAM01 207.7 JAN09 130.8 JAP02 138.6 JAP17 174.6
JAP25 174.6 JBP14 98.0 JBR26 110.0 JBS17 82.4 JBW14 130.8 JCC08 164.8
JCC10 207.7 JCH13 110.0 JCH21 116.5 JCL12 174.6 JCL20 146.8 JCR01 233.1
JDM04 110.0 JEG29 246.9 JES29 123.5 JFC28 82.4 JFG08 138.6 JFG26 138.6
JFM24 174.6 JFN11 110.0 JFN21 116.5 JHW29 146.8 JIJ30 146.8 JJD06 174.6
JJD11 185.0 JJD29 138.6 JJI03 110.0 JJM28 220.0 JLC08 185.0 JLD24 233.1
JLH03 174.6 JLM18 207.7 JLM27 123.5 JLS11 130.8 JLS18 138.6 JMC18 138.6
JME23 164.8 JMH22 155.6 JMJ04 207.7 JMZ16 196.0 JOP07 130.8 JPB07 98.0
JPB17 164.8 JPB30 98.0 JPM25 110.0 JPP27 207.7 JRF30 123.5 JRP20 110.0
JSG18 207.7 JTM05 87.3 JTS02 103.8 JWE23 185.0 JWK27 98.0 JWM15 116.5
JXBl16 110.0 JXB26 116.5 JXC21 220.0 JXD01 138.6 JXD08 138.6 JXD30 123.5
JXF11 246.9 JXF29 103.8 JXG05 138.6 JXM30 146.8 JXS09 110.0 JXS14 146.8
JXS23 98.0 JXS39 146.8 JXZ11 123.5 KAB03 185.0 KAC07 246.9 KAO09 261.6
KAS09 233.1 KAS14 220.0 KCG23 246.9 KCG25 220.0 KDB23 220.0 KEP27 87.3












Table C-1. Continued


KEW22 220.0 KGM22 220.0 KJB19 164.8 KJI23 138.6 KJI24 130.8 KJnll 116.5
KJM08 130.8 KJS28 207.7 KJW07 103.8 KLC06 207.7 KLD26 164.8 KMC19 207.7
KMC22 207.7 KMC27 207.7 KMS29 155.6 KMW05 311.1 KPS25 103.8 KTJ26 220.0
KWD22 185.0 KXA21 164.8 KXB17 246.9 KXH19 246.9 LAC02 164.8 LAD13 130.8
LAI04 174.6 LAP05 116.5 LAR05 116.5 LBA24 220.0 LCW30 196.0 LDJ11 82.4
LGK25 110.0 LGM 1 185.0 LHL08 207.7 LJH06 207.7 LJM24 196.0 LJS31 220.0
LLM22 277.2 LMB18 116.5 LMM04 185.0 LMM17 196.0 LMP12 196.0 LNC11 98.0
LPN14 146.8 LRD21 116.5 LRM03 293.7 LSB18 174.6 LVD28 261.6 LWR18 220.0
LXC01 207.7 LXC11 207.7 LXC28 207.7 LXD22 207.7 LXG17 116.5 LXR15 103.8
LXS05 196.0 MAB06 196.0 MAB11 146.8 MAC03 185.0 MAM08 207.7 MAM21 220.0
MAT26 261.6 MAT28 233.1 MBM05 155.6 MBM21 196.0 MBM25 185.0 MCA07 164.8
MCB20 174.6 MCW14 277.2 MCW21 196.0 MEC06 196.0 MEC28 174.6 MEH26 196.0
MEW15 246.9 MFC20 123.5 MGM28 220.0 MGV 1 103.8 MHL19 138.6 MID08 174.6
MJL02 130.8 MJM04 207.7 MJZ18 196.0 MKL31 123.5 MLB16 196.0 MLC08 233.1
MLC23 174.6 MLF13 196.0 MLG10 233.1 MMD 1 233.1 MMD15 233.1 MMG27 246.9
MMM12 246.9 MMR 1 138.6 MMS29 130.8 MNH04 207.7 MNH14 261.6 MPB23 103.8
MPC21 207.7 MPF25 110.0 MPH12 220.0 MPS09 246.9 MPS21 233.1 MPS23 311.1
MPS26 220.0 MRB11 98.0 MRB25 98.0 MRB30 92.5 MRC20 174.6 MRM16 155.6
MRR22 174.6 MSM20 77.8 MWD28 110.0 MXC10 233.1 MXN24 233.1 MXS06 246.9
MXS10 233.1 MYW04 220.0 MYW14 207.7 NAC21 98.0 NAP26 92.5 NFG08 207.7
NGA16 116.5 NJS06 207.7 NLC08 185.0 NMB28 185.0 NMC22 233.1 NMF04 164.8
NML15 196.0 NMR29 123.5 NMV07 207.7 NXM18 185.0 NXR08 185.0 OAB28 69.3
ORS18 98.0 OWH04 233.1 OWP02 246.9 PAM01 92.5 PAT10 110.0 PCL24 110.0
PDO11 110.0 PEE09 185.0 PFM03 103.8 PGB16 110.0 PJM12 98.0 PLW14 207.7
PMC26 92.5 PMD25 130.8 PMF03 233.1 PSA21 155.6 PTO18 98.0 PTO22 98.0
PTS01 130.8 RAB08 185.0 RAB22 196.0 RAE12 110.0 RAM30 261.6 RAN30 261.6
RBC09 155.6 RBD03 155.6 RCC11 233.1 REC19 233.1 REW16 110.0 RFC19 233.1
RFC28 116.5 RFH18 155.6 RFH19 130.8 RGE19 82.4 RHG07 220.0 RHP12 196.0
RJC24 98.0 RJF14 164.8 RJF22 174.6 RJL28 92.5 RJR15 110.0 RJR29 116.5
RJZ16 185.0 RLM21 123.5 RMB07 98.0 RMC07 155.6 RMC18 196.0 RMF14 196.0
RML13 233.1 RMM13 246.9 RPC14 174.6 RPJ15 116.5 RPQ20 103.8 RSM20 130.8
RTH15 87.3 RTL17 87.3 RWC23 98.0 RWF06 146.8 RWR14 110.0 RWR16 116.5
RXG29 98.0 RXM15 110.0 RXP02 138.6 RXS13 130.8 SAC10 103.8 SAE01 164.8
SAM25 138.6 SAR14 207.7 SAV18 277.2 SBF11 207.7 SBF24 207.7 SCC15 138.6
SCH15 207.7 SEC02 196.0 SEF10 98.0 SEG18 130.8 SEH26 174.6 SEH28 246.9
SEK06 164.8 SEM27 116.5 SFD17 116.5 SFD23 87.3 SFM22 92.5 SGN18 138.6
SHC07 164.8 SHD17 220.0 SHT20 138.6 SJD28 123.5 SLC23 220.0 SLG05 196.0
SLM27 87.3 SMD22 207.7 SMK04 370.0 SMK23 146.8 SMW17 77.8 SPM26 92.5
SRB31 174.6 SRR24 130.8 SWB14 123.5 SWS04 155.6 SXC02 146.8 SXG23 174.6
SXH10 185.0 SXM27 196.0 SXS16 220.0 SXZ01 87.3 TAB21 174.6 TAC22 207.7
TAR18 155.6 TCD26 138.6 TES03 220.0 TLP13 233.1 TLS08 185.0 TMK04 261.6
TNC14 207.7 TPM04 155.6 TPP11 220.0 TPP24 185.0 TPS16 116.5 TRF06 116.5
TRF21 98.0 TRS28 185.0 VFM11 220.0 VJV02 130.8 VJV09 110.0 VMB18 174.6
VMS04 277.2 VMS05 246.9 VRS01 164.8 WBR12 277.2 WCB24 174.6 WDK04 110.0
WDK13 220.0 WDK17 130.8 WDK47 146.8 WFC07 116.5 WJB06 233.1 WJB12 110.0
WJF15 174.6 WJP20 123.5 WPB30 123.5 WPK11 110.0 WSB06 110.0 WST20 87.3
WTG07 130.8 WXE04 123.5 WXH02 103.8 WXS21 110.0 LME07 659.3 EAM05 146.8
JEC18 196.0 TMD12 349.2 SMA08 220.0 SHD04 349.2 KXH30 174.6 VAW07 174.6










REFERENCES

American Standards Association (1960). "Acoustical Terminology S1.1-1960" (American
Standards Association, New York).

American National Standards Institute (1994). ANSI S1.1-1994, "American National Standard
Acoustical Terminology" (Acoustical Society of America, New York).

Askenfelt, A. (1979). "Automatic notation of played music: the VISA project," Fontes Artis
Musicae 26, 109-118.

Bagshaw, P. C., Hiller, S. M., and Jack, M. A. (1993). "Enhanced pitch tracking and the
processing of F0 contours for computer aided intonation teaching," Proc. European Conf. on
Speech Comm. (Eurospeech), pp. 1003-1006.

Bagshaw, P. C. (1994). "Automatic prosodic analysis for computer aided pronunciation
teaching," doctoral dissertation.

Boersma, P. (1993). "Accurate short-term analysis of the fundamental frequency and the
harmonics-to-noise ratio of a sampled sound," Proc. Institute of Phonetic Sciences 17, 97-110.

Dannenberg, R. B., Birmingham, W. P., Tzanetakis, G. P., Meek, C. P., Hu, N. P., and Pardo,
B. P. (2004). "The MUSART Testbed for Query-by-Humming Evaluation," Comp. Mus. J.
28, 34-48.

de Boer, E. (1976). "On the 'residue' and auditory pitch perception," in Handbook of Sensory
Physiology, edited by W. D. Keidel and W. D. Neff (Springer-Verlag, New York), Vol.
V/3, 479-583.

De Bot, K. (1983). "Visual feedback of intonation I: Effectiveness and induced practice
behavior," Language and Speech 26, 331-350.

de Cheveigné, A., and Kawahara, H. (2002). "YIN, a fundamental frequency estimator for speech
and music," J. Acoust. Soc. Am. 111, 1917-1930.

Di Martino, J., and Laprie, Y. (1999). "An efficient F0 determination algorithm based on the implicit
calculation of the autocorrelation of the temporal excitation signal," Proc. EUROSPEECH,
2773-2776.

Doughty, J., and Garner, W. (1947). "Pitch characteristics of short tones. I. Two kinds of pitch
threshold," J. Exp. Psychol. 37, 351-365.

Duifhuis, H., Willems, L. F., and Sluyter, R. J. (1982). "Measurement of pitch in speech: an
implementation of Goldstein's theory of pitch perception," J. Acoust. Soc. Am. 71, 1568-1580.










Fant, G. (1960). Acoustic Theory of Speech Production, with Calculations Based on X-Ray Studies
of Russian Articulations (Mouton De Gruyter).

Fastl, H., and Stoll, G. (1979). "Scaling of pitch strength," Hear. Res. 1, 293-301.

Fastl, H., and Zwicker, E. (2007). Psychoacoustics: Facts and Models (Springer, Berlin).

Galembo, A., Askenfelt, A., Cuddy, L. L., and Russo, F. A. (2001). "Effects of relative phases on
pitch and timbre in the piano bass range," J. Acoust. Soc. Am. 110, 1649-1666.

Glasberg, B. R., and Moore, B. C. J. (1990). "Derivation of auditory filter shapes from notched-
noise data," Hear. Res. 47, 103-138.

Hermes, D. J. (1988). "Measurement of pitch by subharmonic summation," J. Acoust. Soc. Am.
83, 257-264.

Houtgast, T. (1976). "Subharmonic pitches of a pure tone at low S/N ratio," J. Acoust. Soc. Am.
60, 405-409.

Houtsma, A. J. M., and Smurzynski, J. (1990). "Pitch identification and discrimination for
complex tones with many harmonics," J. Acoust. Soc. Am. 87, 304-310.

Kawahara, H., Katayose, H., de Cheveigné, A., and Patterson, R. D. (1999). "Fixed Point
Analysis of Frequency to Instantaneous Frequency Mapping for Accurate Estimation of F0
and Periodicity," Proc. EUROSPEECH 6, 2781-2784.

Medan, Y., Yair, E., and Chazan, D. (1991). "Super resolution pitch determination of speech
signals," IEEE Trans. Acoust., Speech, Signal Process. 39, 40-48.

Moore, B. C. J. (1977). "Effects of relative phase of the components on the pitch of three-
component complex tones," in Psychophysics and Physiology of Hearing, edited by E. F.
Evans and J. P. Wilson (Academic, London).

Moore, B. C. J. (1986). "Parallels between frequency selectivity measured psychophysically and
in cochlear mechanics," Scand. Audiol. Supplement 25, 139-152.

Moore, B. C. J. (1997). An Introduction to the Psychology of Hearing (Academic, London).

Murray, I. R., and Arnott, J. L. (1993). "Toward the simulation of emotion in synthetic speech:
A review of the literature of human vocal emotion," J. Acoust. Soc. Am. 93, 1097-1108.

Noll, A. M. (1967). "Cepstrum pitch determination," J. Acoust. Soc. Am. 41, 293-309.

Oppenheim, A. V., Schafer, R. W. and Buck, J. R. (1999). Discrete-Time Signal Processing
(Prentice Hall, New Jersey).

Patel, A. D., and Balaban, E. (2001). "Human pitch perception is reflected in the timing of
stimulus-related cortical activity," Nature Neuroscience 4, 839-844.










Plante, F., Meyer, G., and Ainsworth, W. A. (1995). "A pitch extraction reference database,"
EUROSPEECH-1995, 837-840.

Rabiner, L. R. (1977). "On the Use of Autocorrelation Analysis for Pitch Detection," IEEE Trans.
Acoust., Speech, Signal Process. 25, 24-33.

Robinson, K. L., and Patterson, R. D. (1995a). "The duration required to identify the instrument,
the octave, or the pitch-chroma of a musical note," Music Perception 13, 1-15.

Robinson, K. L., and Patterson, R. D. (1995b). "The stimulus duration required to identify vowels,
their octave, and their pitch-chroma," J. Acoust. Soc. Am. 98, 1858-1865.

Schroeder, M. R. (1968). "Period histogram and product spectrum: new methods for fundamental
frequency measurement," J. Acoust. Soc. Am. 43, 829-834.

Schwartz, D. A., and Purves, D. (2004). "Pitch is determined by naturally occurring periodic
sources," Hear. Res. 194, 31-46.

Secrest, B., and Doddington, G. (1983). "An integrated pitch tracking algorithm for speech
systems," Proc. ICASSP-83, pp. 1352-1355.

Sethares, W. A. (1998). Tuning, Timbre, Spectrum, Scale (Springer, London).

Shackleton, T. M., and Carlyon, R. P. (1994). "The role of resolved and unresolved harmonics in
pitch perception and frequency modulation discrimination," J. Acoust. Soc. Am. 95, 3529-3540.

Shofner, W. P., and Selas, G. (2002). "Pitch strength and Stevens's power law," Percept.
Psychophys. 64, 437-450.

Sondhi, M. M. (1968). "New Methods of Pitch Extraction," IEEE Trans. Audio and
Electroacoustics AU-16, 262-266.

Spanias, A. S. (1994). "Speech coding: a tutorial review," Proc. IEEE 82, 1541-1582.

Stevens, S. S. (1935). "The relation of pitch to intensity," J. Acoust. Soc. Am. 6, 150-154.

Sun, X. (2000). "A pitch determination algorithm based on subharmonic-to-harmonic ratio,"
Proc. Int. Conf. Spoken Language Process. 4, 676-679.

Takeshima, H., Suzuki, Y., Ozawa, K., Kumagai, M., and Sone, T. (2003). "Comparison of
loudness functions suitable for drawing equal-loudness level contours," Acoust. Sci. Tech.
24, 61-68.

Verschuure, J., and van Meeteren, A. A. (1975). "The effect of intensity on pitch," Acustica 32, 33-44.

von Helmholtz, H. (1863). On the Sensations of Tone as a Physiological Basis for the Theory of
Music (Kessinger Publishing).










Wang, M. and Lin, M. (2004). "An Analysis of Pitch in Chinese Spontaneous Speech," Int.
Symp. on Tonal Aspects of Tone Languages, Beijing, China.

Wiegrebe, L., and Patterson, R. D. (1998). "Temporal dynamics of pitch strength in regular
interval noises," J. Acoust. Soc. Am. 104, 2307-2313.

Huang, X., Acero, A., and Hon, H. W. (2001). Spoken Language Processing: A Guide to Theory,
Algorithm, and System Development (Prentice Hall, New Jersey).

Yost, W. A. (1996). "Pitch strength of iterated rippled noise," J. Acoust. Soc. Am. 100, 3329-3335.









BIOGRAPHICAL SKETCH

Arturo Camacho was born in San Jose, Costa Rica, on October 21, 1972. He attended

elementary school at Centro Educativo Roberto Cantillano Vindas and high school at Liceo

Salvador Umaña Castro. After that, he studied music at the Universidad Nacional, and at the

same time he performed as a pianist in some of the most popular Costa Rican Latin music bands of

the 1990s. He also studied Computer and Information Science at the Universidad de Costa Rica,

where he obtained his B.S. degree in 2001. He worked for a short time as a software engineer at

Banco Central de Costa Rica during that year, but he soon moved to the United States to pursue

graduate studies in Computer Engineering at the University of Florida. He received his M.S. and

Ph.D. degrees in 2003 and 2007, respectively.

Arturo's research interests span all areas of automatic music analysis, from the lowest-level

tasks like pitch estimation and timbre identification, to the highest-level tasks like analysis of

harmony and genre. His dream is to one day have a computer program that allows him (and

everyone) to analyze music as well as or better than a well-trained musician would.

Currently, Arturo lives happily with his beloved wife Alexandra, who is another Ph.D. Gator

in Computer Engineering and whom he married in 2002, and their beloved daughter Melissa, who

was born in 2006.





SWIPE: A SAWTOOTH WAVEFORM INSPIRED PITCH ESTIMATOR FOR SPEECH AND MUSIC

By

ARTURO CAMACHO

A DISSERTATION PRESENTED TO THE GRADUATE SCHOOL
OF THE UNIVERSITY OF FLORIDA IN PARTIAL FULFILLMENT
OF THE REQUIREMENTS FOR THE DEGREE OF
DOCTOR OF PHILOSOPHY

UNIVERSITY OF FLORIDA

2007

© 2007 Arturo Camacho

I dedicate this dissertation to my dear grandparents Hugo and Flory

ACKNOWLEDGMENTS

I thank my grandparents for all the support they have given to me during my life, my wife for her support during the years in graduate school, and my daughter, who was in my arms when the most important ideas expressed here came to my mind. I also thank Dr. John Harris for his guidance during my research and for always pushing me to do more, Dr. Rahul Shrivastav for his support and for introducing me to the world of auditory system models, and Dr. Manuel Bermudez for his unconditional support all these years.

TABLE OF CONTENTS

CHAPTER

1 INTRODUCTION
  1.1 Pitch Background
    1.1.1 Conceptual Definition
    1.1.2 Operational Definition
    1.1.3 Strength
    1.1.4 Duration Threshold
  1.2 Illustrative Examples and Pitch Determination Hypotheses
    1.2.1 Pure Tone
    1.2.2 Sawtooth Waveform and the Largest Peak Hypothesis
    1.2.3 Missing Fundamental and the Components Spacing Hypothesis
    1.2.4 Square Wave and the Maximum Common Divisor Hypothesis
    1.2.5 Alternating Pulse Train
    1.2.6 Inharmonic Signals
  1.3 Loudness
  1.4 Equivalent Rectangular Bandwidth
  1.5 Dissertation Organization
  1.6 Summary

2 PITCH ESTIMATION ALGORITHMS: PROBLEMS AND SOLUTIONS
  2.1 Harmonic Product Spectrum (HPS)
  2.2 Sub-harmonic Summation (SHS)
  2.3 Subharmonic to Harmonic Ratio (SHR)
  2.4 Harmonic Sieve (HS)
  2.5 Autocorrelation (AC)
  2.6 Average Magnitude and Squared Difference Functions (AMDF, ASDF)
  2.7 Cepstrum (CEP)
  2.8 Summary

3 THE SAWTOOTH WAVEFORM INSPIRED PITCH ESTIMATOR
  3.1 Initial Approach: Average Peak-to-Valley Distance Measurement
  3.2 Blurring of the Harmonics
  3.3 Warping of the Spectrum
  3.4 Weighting of the Harmonics
  3.5 Number of Harmonics
  3.6 Warping of the Frequency Scale
  3.7 Window Type and Size
  3.8 SWIPE
  3.9 SWIPE′
    3.9.1 Pitch Strength of a Sawtooth Waveform
  3.10 Reducing Computational Cost
    3.10.1 Reducing the Number of Fourier Transforms
      3.10.1.1 Reducing window overlap
      3.10.1.2 Using only power-of-two window sizes
    3.10.2 Reducing the Number of Spectral Integral Transforms
  3.11 Summary

4 EVALUATION
  4.1 Algorithms
  4.2 Databases
  4.3 Methodology
  4.4 Results
  4.5 Discussion

5 CONCLUSION

APPENDIX

A MATLAB IMPLEMENTATION OF SWIPE′

B DETAILS OF THE EVALUATION
  B.1 Databases
    B.1.1 Paul Bagshaw's Database
    B.1.2 Keele Pitch Database
    B.1.3 Disordered Voice Database
    B.1.4 Musical Instruments Database
  B.2 Evaluation Using Speech
  B.3 Evaluation Using Musical Instruments

C GROUND TRUTH PITCH FOR THE DISORDERED VOICE DATABASE

REFERENCES

BIOGRAPHICAL SKETCH

LIST OF TABLES

3-1 Common windows used in signal processing
4-1 Gross error rates for speech
4-2 Proportion of overestimation errors relative to total gross errors
4-3 Gross error rates by gender
4-4 Gross error rates for musical instruments
4-5 Gross error rates by instrument family
4-6 Gross error rates for musical instruments by octave
4-7 Gross error rates for musical instruments by dynamic
4-8 Gross error rates for variations of SWIPE′
C-1 Ground truth pitch values for the disordered voice database

LIST OF FIGURES

1-1 Sawtooth waveform
1-2 Pure tone
1-3 Missing fundamental
1-4 Square wave
1-5 Pulse train
1-6 Alternating pulse train
1-7 Inharmonic signal
1-8 Equivalent rectangular bandwidth
1-9 Equivalent-rectangular-bandwidth scale
2-2 Harmonic product spectrum
2-3 Subharmonic summation
2-4 Subharmonic summation with decay
2-5 Subharmonic to harmonic ratio
2-6 Harmonic sieve
2-7 Autocorrelation
2-8 Comparison between AC, BAC, ASDF, and AMDF
2-9 Cepstrum
2-10 Problem caused to cepstrum by cosine lobe at DC
3-1 Average-peak-to-valley-distance kernel
3-3 Necessity of strictly convex kernels
3-4 Kernels formed from concatenations of truncated squarings, Gaussians, and cosines
3-5 Warping of the spectrum
3-6 Weighting of the harmonics
3-7 Fourier transform of rectangular window
3-8 Cosine lobe and square-root of the spectrum of rectangular window
3-9 Hann window
3-10 Fourier transform of the Hann window
3-11 Cosine lobe and square-root of the spectrum of Hann window
3-12 SWIPE kernel
3-13 Most common pitch estimation errors
3-14 SWIPE kernel
3-15 Pitch strength of sawtooth waveform
3-16 Windows overlapping
3-17 Idealized spectral lobes
3-18 K+-normalized inner product between template and idealized spectral lobes
3-19 Individual and combined pitch strength curves
3-20 Pitch strength loss when using suboptimal window sizes
3-21 Coefficients of the pitch strength interpolation polynomial
3-22 Interpolated pitch strength

LIST OF OBJECTS

Object 1-1. Sawtooth waveform (WAV file, 32 KB)
Object 1-2. Pure tone (WAV file, 32 KB)
Object 1-3. Missing fundamental (WAV file, 32 KB)
Object 1-4. Square wave (WAV file, 32 KB)
Object 1-5. Pulse train (WAV file, 32 KB)
Object 1-6. Alternating pulse train (WAV file, 32 KB)
Object 1-7. Inharmonic signal (WAV file, 32 KB)
Object 2-1. Bandpass filtered /u/ (WAV file, 6 KB)
Object 2-2. Signal with strong second harmonic (WAV file, 32 KB)
Object 3-1. Beating tones (WAV file, 32 KB)

LIST OF ABBREVIATIONS

AC Autocorrelation
AMDF Average magnitude difference function
APVD Average peak-to-valley distance
ASDF Average squared difference function
BAC Biased autocorrelation
CEP Cepstrum
ERB Equivalent rectangular bandwidth
ERBs Equivalent-rectangular-bandwidth scale
FFT Fast Fourier transform
HPS Harmonic product spectrum
HS Harmonic sieve
IP Inner product
IT Integral transform
ISL Idealized spectral lobe
K+-NIP K+-normalized inner product
NIP Normalized inner product
O-WS Optimal window size
P2-WS Power-of-two window size
SHS Subharmonic summation
SHR Subharmonic-to-harmonic ratio
STFT Short-time Fourier transform
SWIPE Sawtooth Waveform Inspired Pitch Estimator
UAC Unbiased autocorrelation
WS Window size

Abstract of Dissertation Presented to the Graduate School of the University of Florida in Partial Fulfillment of the Requirements for the Degree of Doctor of Philosophy

SWIPE: A SAWTOOTH WAVEFORM INSPIRED PITCH ESTIMATOR FOR SPEECH AND MUSIC

By

Arturo Camacho

December 2007

Chair: John G. Harris
Major: Computer Engineering

A Sawtooth Waveform Inspired Pitch Estimator (SWIPE) has been developed for processing speech and music. SWIPE is shown to outperform existing algorithms on several publicly available speech/musical-instruments databases and a disordered speech database. SWIPE estimates the pitch as the fundamental frequency of the sawtooth waveform whose spectrum best matches the spectrum of the input signal. A decaying cosine kernel provides an extension to older frequency-based, sieve-type estimation algorithms by providing smooth peaks with decaying amplitudes to correlate with the harmonics of the signal. An improvement on the algorithm is achieved by using only the first and prime harmonics, which significantly reduces subharmonic errors commonly found in other pitch estimation algorithms.
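As a toy illustration of the last point of the abstract (a sketch under simplifying assumptions, not the dissertation's kernel), the snippet below lists the first and prime harmonics and counts how many kernel harmonics of a subharmonic candidate coincide with harmonics of the signal. It assumes a signal at f0 that contains all harmonics; a candidate at f0/d places its k-th harmonic at k·f0/d, which is a harmonic of the signal only when k is a multiple of d. The function names are hypothetical.

```python
def first_and_prime_harmonics(n_max):
    """Return the harmonic numbers 1 and every prime up to n_max, in order."""
    if n_max < 1:
        return []
    # Sieve of Eratosthenes over 0..n_max.
    is_prime = [True] * (n_max + 1)
    is_prime[0] = False
    is_prime[1] = False
    p = 2
    while p * p <= n_max:
        if is_prime[p]:
            for multiple in range(p * p, n_max + 1, p):
                is_prime[multiple] = False
        p += 1
    return [1] + [n for n in range(2, n_max + 1) if is_prime[n]]


def harmonics_matched_by_subharmonic(kernel_harmonics, divisor):
    """Kernel harmonics of a candidate at f0/divisor that land on harmonics
    of a signal at f0 (only multiples of `divisor` coincide)."""
    return [k for k in kernel_harmonics if k % divisor == 0]


if __name__ == "__main__":
    all_harmonics = list(range(1, 13))
    prime_kernel = first_and_prime_harmonics(12)
    print(prime_kernel)                                        # [1, 2, 3, 5, 7, 11]
    print(harmonics_matched_by_subharmonic(all_harmonics, 2))  # [2, 4, 6, 8, 10, 12]
    print(harmonics_matched_by_subharmonic(prime_kernel, 2))   # [2]
```

With all twelve harmonics in the kernel, half of the candidate's harmonics at f0/2 find support in the signal; with only the first and prime harmonics, a single one does, which gives an intuition for why such a kernel scores subharmonic candidates lower.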


CHAPTER 1
INTRODUCTION

Pitch is an important characteristic of sound, providing information about the sound’s source. In speech, pitch helps to identify the gender of the speaker (pitch tends to be higher for females than for males) (Wang and Lin, 2004), gives additional meaning to words (e.g., a group of words can be interpreted as a question depending on whether the pitch is rising or not), and may help to identify the emotional state of the speaker (e.g., joy produces high pitch and a wide pitch range, while sadness produces normal to low pitch and a narrow pitch range) (Murray and Arnott, 1993). Pitch is also important in music because it determines the names of the notes (Sethares, 1998).

Pitch estimation also has applications in many areas that involve processing of sound: music, communications, linguistics, and speech pathology. In music, one of the main applications of pitch estimation is automatic music transcription. Musicologists are often faced with music for which no transcription exists. Therefore, automated tools that extract the pitch of a melody, and from there the individual musical notes, are invaluable to musicologists (Askenfelt, 1979). Automated transcription has also been used in query-by-humming systems (e.g., Dannenberg et al., 2004). These systems allow people to search for music in databases by singing or humming the melody rather than typing the title of the song, which may be unknown to the user or the database.

In communications, pitch estimation is used for speech coding (Spanias, 1994). Many speech coding systems are based on the source-filter model (Fant, 1960), which models speech as a filtered source signal. In some implementations, the source is either a periodic sequence of glottal pulses (for voiced sound) or white noise (for unvoiced sound). Therefore, the correct estimation of the glottal pulse rate is crucial for the correct coding of voiced speech.


Pitch estimators are useful in linguistics for the recognition of intonation patterns, which are used, for example, in the acquisition of a second language (de Bot, 1983). Pitch estimators are also used in speech pathology to determine speech disorders, which are characterized by high levels of noise in the voice. Since most methods used to estimate noise are based on the fundamental frequency of the signal (e.g., Yomoto and Gould, 1982), pitch estimators are of vital importance in this area.

The goal of our work is to develop an automatic pitch estimator that operates on both speech and music. The algorithm should be competitive with the best known pitch estimators, and therefore be suitable for the many applications mentioned above. Furthermore, the algorithm should provide a measure to determine whether or not a pitch exists in each region of the signal.

The remaining sections of this chapter present several psychoacoustic definitions and phenomena that will be used to explain the operation and rationale of the algorithm.

1.1 Pitch Background

1.1.1 Conceptual Definition

Several conceptual definitions of pitch have been proposed. The American Standards Association (ASA, 1960) definition of pitch is

“Pitch is that attribute of auditory sensation in terms of which sounds may be ordered on a musical scale,”

and the American National Standards Institute (ANSI, 1994) definition of pitch is

“Pitch is that auditory attribute of sound according to which sounds can be ordered on a scale from low to high. Pitch depends mainly on the frequency content of the sound stimulus, but it also depends on the sound pressure and the waveform of the stimulus.”

These definitions mention an attribute that allows ordering sounds on a scale, but they say nothing about what that attribute is.


We will propose another definition of pitch, which is based on the fundamental frequency of a signal. The fundamental frequency $f_0$ of a signal (sound or not) exists only for periodic signals, and is defined as the inverse of the period of the signal, where the period $T_0$ of the signal (a.k.a. fundamental period) is the minimum repetition interval of the signal $x(t)$, i.e.,

$$T_0 = \min\{\, T > 0 \mid \forall t : x(t) = x(t+T) \,\}. \tag{1-1}$$

It is also possible to define the fundamental frequency in the frequency domain:

$$f_0 = \max\Big\{\, f > 0 \;\Big|\; \exists\, \{c_k\}, \{\phi_k\} : x(t) = \sum_{k=0}^{\infty} c_k \sin(2\pi k f t + \phi_k) \,\Big\}. \tag{1-2}$$

Although both equations are mathematically equivalent (i.e., it can be shown that $f_0 = 1/T_0$), they are conceptually different: Equation 1-1 looks at the signal in the time domain, while Equation 1-2 looks at the signal as a combination of sinusoids using a Fourier series expansion. The key element for periodicity in Equation 1-1 is the equality $x(t) = x(t+T)$, and the key element for periodicity in Equation 1-2 is the existence of components only at multiples of the fundamental frequency.

Unfortunately, no signal in nature is perfectly periodic because of natural variations in frequency and amplitude, and contamination produced by noise. Nevertheless, when listening to many natural signals we perceive pitch. This suggests that, to determine pitch, the brain probably uses either a modified version of Equation 1-1, where the equality $x(t) = x(t+T)$ is replaced by an approximation, or a modified version of Equation 1-2, where noise and fluctuations in the frequency of the components are allowed. Based on this suggestion, we define pitch as the perceived “fundamental frequency” of a sound, in other words, as the estimate our brain makes of the (quasi) fundamental frequency of a sound.

1.1.2 Operational Definition

Since the previous definitions of pitch do not indicate how to measure it, they are of no practical use, and an operational definition of pitch is required. The usual way in which pitch is


measured is the following. A listener is presented with two sounds: a target sound, for which the pitch is to be determined, and a matching sound. The matching sound is usually a pure tone, although sometimes harmonic complex tones are used as well. The levels of the target and the matching sounds are usually equalized to avoid any effect of differences in level on the perception of pitch. The sounds are presented sequentially, simultaneously, or in any combination of the two, depending on the design of the experiment. The listener is asked to adjust the fundamental frequency of the matching sound such that it matches the target sound, in the sense of the conceptual definitions of pitch presented above. The fundamental frequency of the matching sound is recorded and the experiment is repeated several times and with different listeners. The data is summarized, and if the distribution of fundamental frequencies shows a clear peak around a certain frequency, the target sound is said to have a pitch corresponding to that frequency.

1.1.3 Strength

Some sounds elicit a strong pitch sensation, and some do not. For example, when we speak, some sounds are highly periodic and elicit a strong pitch sensation (e.g., vowels), but some do not (e.g., some consonants: /s/, /sh/, /p/, and /k/). In the case of musical instruments, the attack tends to contain transient components that obscure the pitch, but they disappear quickly, letting the pitch show more clearly. The quality of the sound that allows us to determine whether pitch exists is called pitch strength. Pitch strength is not a categorical variable but a continuum. Also, pitch strength is independent of pitch: two sounds may have the same pitch and differ in pitch strength. For example, a pure tone and a narrow band of noise centered at the frequency of the tone have the same pitch; however, the pure tone elicits a stronger pitch sensation than the noise.


Unfortunately, not much research exists on pitch strength, and the few studies that exist have concentrated mostly on noise (Yost, 1996; Wiegrebe and Patterson, 1998), although some have explored harmonic sounds as well (Fastl and Stoll, 1979; Shofner and Selas, 2002). In terms of variety of sounds, the most complete study is probably Fastl and Stoll’s, which included pure tones, complex tones, and several types of noises. In that study, pure tones were reported to have the strongest pitch among all sounds. However, other studies have found that pitch identification improves as harmonics are added (Houtsma, 1990), which suggests that pitch strength increases as well.

We hypothesize that our brain determines pitch by searching for a match between our voice, produced or imagined, and the target signal for which pitch is to be determined, probably based on their spectra. This hypothesis agrees with studies of pitch determination in which subjects have been allowed to hum the target sound to facilitate pitch matching tasks (Houtgast, 1976).

Based on this hypothesis, we believe that the higher the similarity of the target signal with our voice, the higher its pitch strength. If the similarity is based on the spectrum, a signal will have maximum pitch strength when its spectrum is closest to the spectrum of a voiced sound. If we assume that voiced sounds have harmonic spectra with envelopes that decay on average as $1/f$ (i.e., inversely proportional to frequency) (Fant, 1960), then a signal will have maximum pitch strength if its spectrum has that structure.

An example of a signal with this property is a sawtooth waveform, which is exemplified in Figure 1-1. A sawtooth waveform is formed by adding sines with frequencies that are multiples of a common fundamental $f_0$, and whose amplitudes decay inversely proportional to frequency:

$$x(t) = \sum_{k=1}^{\infty} \frac{1}{k} \sin 2\pi k f_0 t. \tag{1-3}$$
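As an illustration of Equation 1-3, the following is a minimal sketch of additive synthesis of a sawtooth waveform, assuming a truncated sum with a finite number of harmonics. The function names and the choice of 100 Hz are hypothetical and used only for this example.

```python
import math


def sawtooth_partial_sum(t, f0, n_harmonics):
    """Truncated version of Equation 1-3: sum of sin(2*pi*k*f0*t)/k for k = 1..n."""
    return sum(math.sin(2.0 * math.pi * k * f0 * t) / k
               for k in range(1, n_harmonics + 1))


def harmonic_amplitudes(n_harmonics):
    """Amplitudes 1/k of the first n harmonics (the 1/f spectral envelope)."""
    return [1.0 / k for k in range(1, n_harmonics + 1)]


# One period (10 ms) of a 100 Hz sawtooth approximated with 50 harmonics,
# sampled at 20 equally spaced instants.
f0 = 100.0
samples = [sawtooth_partial_sum(i * 0.0005, f0, 50) for i in range(20)]

# The k-th harmonic sits at k*f0 with amplitude 1/k.
for k, amplitude in enumerate(harmonic_amplitudes(5), start=1):
    print(f"{k * f0:6.0f} Hz  amplitude {amplitude:.3f}")
```

Truncating the sum changes the sharpness of the waveform edges, but the amplitudes of the retained components still follow the 1/k envelope.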


Figure 1-1. Sawtooth waveform. A) Signal. B) Spectrum.
Object 1-1. Sawtooth waveform (WAV file, 32 KB).

Though sawtooth waveforms play a key role in our research, their importance resides in their spectrum, and not in their time-domain waveform. In particular, the phase of the components can be manipulated (destroying the sawtooth shape of the waveform) and the signal would still play the same role in our work as the sawtooth waveform. In other words, it is assumed in this work that what matters for estimating pitch and its strength is the amplitude of the spectral components of the sound, and not their phase, which in fact is ignored here. However, phase does play a role in pitch perception, as has been shown by some researchers (Moore, 1977; Shackleton and Carlyon, 1994; Galembo et al., 2001). These researchers have created pairs of sounds that have the same spectral amplitudes but significantly different pitches, by choosing the phases of the components appropriately. Nevertheless, it is not the aim of this research to cover the whole range of pitch phenomena, but to concentrate only on the most common speech and


musical instrument sounds. As we will see later, good pitch predictions are obtained for these types of sounds based solely on the amplitude of their spectral components.

1.1.4 Duration Threshold

Doughty and Garner (1947) studied the minimum duration required to perceive a pitch for a pure tone. They found that there are two duration thresholds with two different properties. Tones with durations below the shorter threshold are perceived as a click, and no pitch is perceived. Tones with durations between the two thresholds are perceived as having pitch, and an increase in their duration causes an increase in their pitch strength. Tones with durations above the larger threshold are also perceived as having pitch, but further increases in their duration do not increase their pitch strength.

These thresholds are not constant, but approximately proportional to the pitch period of the tone. In other words, each threshold corresponds to a certain number of periods of the tone. However, there is some interaction between pitch and the minimum number of cycles required to perceive it (lower frequencies have a tendency to require fewer cycles to elicit a pitch). The shorter threshold is approximately two to four cycles, and the larger threshold is approximately three to ten cycles. For frequencies above 1 kHz the thresholds become constant: 4 ms for the shorter and 10 ms for the larger, regardless of their corresponding number of cycles.

Robinson and Patterson (1995a; 1995b) studied note discriminability as a function of the number of cycles in the sound using strings, brass, flutes, and harpsichords. A large increase in discriminability can be observed in their data as the number of cycles increases from one to about ten, but beyond ten cycles the discriminability of the notes does not seem to increase much. This trend agrees with the thresholds for pure tones mentioned above, which suggests that the thresholds are also valid for musical instruments, and probably for sawtooth waveforms as well.


1.2 Illustrative Examples and Pitch Determination Hypotheses

In previous sections, conceptual and operational definitions of pitch were given. From a practical point of view, both types of definitions are of limited use, since the conceptual definitions are too abstract and the operational definition requires a human to determine the pitch. In this section we propose more algorithmic ways of determining pitch, through the search for cues that may give us hints regarding the pitch. These cues, hereafter referred to as hypotheses, are illustrated with examples of sounds for which they are valid, and examples for which they are not.

1.2.1 Pure Tone

Figure 1-2. Pure tone. A) Signal. B) Spectrum.
Object 1-2. Pure tone (WAV file, 32 KB).

From a frequency domain point of view, the simplest periodic sound is a pure tone. A pure tone with a frequency of 100 Hz and its spectrum are shown in Figure 1-2. Based on our operational definition of pitch (i.e., the one that uses a pure tone as matching tone presented at the same intensity level as the testing tone), the pitch of a pure tone is its frequency, and


therefore frequency determines pitch in this case. This may not be true if the tones are presented at different levels. Intriguingly, the pitch of a pure tone may change with intensity level (Stevens, 1935): as intensity increases, the pitch of high frequency tones tends to increase, and the pitch of low frequency tones tends to decrease. However, this change is usually less than 1% or 2% (Verschuure and van Meeteren, 1975), occurs at very disparate intensity levels, and varies significantly from person to person.

Since the goal in this research is to predict pitch for sounds represented in a computer as a sequence of numbers, without knowing the level at which the sound will be played, it will be assumed that the sound will be played at a “comfortable” level, and therefore the algorithm will be designed to predict the pitch at that level. Nevertheless, variations of pitch with level are small, and have little effect even for complex tones (Fastl, 2007); otherwise, music would go out of tune as we change the volume.

1.2.2 Sawtooth Waveform and the Largest Peak Hypothesis

The sawtooth waveform presented in Section 1.1.3 was shown to have a harmonic spectrum with components whose amplitudes decay inversely proportional to frequency (see Figure 1-1). The computational determination of the pitch of a sawtooth waveform is not as easy as it is for a pure tone because its spectrum has more than one component. Since the pitch of a sawtooth waveform corresponds to its fundamental frequency, and the fundamental frequency in this case is the component with the highest energy, one possible hypothesis for the derivation of the pitch is that the pitch corresponds to the largest peak in the spectrum. However, as we will show in the next section, this hypothesis does not always hold.

1.2.3 Missing Fundamental and the Components Spacing Hypothesis

This section shows that it is possible to create a periodic sound with a pitch corresponding to a frequency at which there is no energy in the spectrum. A sound with this property is said to


have a missing fundamental. It is easy to build such a signal: just take a sawtooth waveform and remove its fundamental, as shown in Figure 1-3. Certainly, the timbre of the sound will change, but not its pitch. This fact disproves the hypothesis that the pitch corresponds to the largest peak in the spectrum.

Figure 1-3. Missing fundamental. A) Signal. B) Spectrum.
Object 1-3. Missing fundamental (WAV file, 32 KB).

After it was realized that the pitch of a complex tone was unaffected by removing the fundamental frequency, it was hypothesized that the pitch corresponds to the spacing of the frequency components. However, this hypothesis is not always valid, as we will show in the next section.

1.2.4 Square Wave and the Maximum Common Divisor Hypothesis

The previous section hypothesized that the pitch corresponds to the spacing between the frequency components. However, it is easy to find an example for which this hypothesis fails: a


square wave. A square wave is similar to a sawtooth wave, but does not have even order harmonics:

$$x(t) = \sum_{k=1}^{\infty} \frac{1}{2k-1} \sin 2\pi (2k-1) f_0 t. \tag{1-4}$$

A square wave with a fundamental frequency of 100 Hz and its spectrum are shown in Figure 1-4. The components are located at odd multiples of 100 Hz, producing a spacing of 200 Hz between them. However, the fundamental frequency, and indeed the pitch, is 100 Hz. Thus, the components spacing hypothesis is invalid.

Figure 1-4. Square wave. A) Signal. B) Spectrum.
Object 1-4. Square wave (WAV file, 32 KB).

A hypothesis that seems to work for this example, and all the previous ones, is that the pitch must correspond to the maximum common divisor of the frequency components. As shown in Equation 1-2, this is equivalent to saying that the pitch corresponds to the fundamental frequency. However, we will show in the next section that this hypothesis is also wrong.
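The difference between the components spacing hypothesis and the maximum common divisor hypothesis can be checked numerically. The sketch below assumes that component frequencies are given as whole numbers of Hz, so that an integer greatest common divisor applies; the helper names are hypothetical.

```python
from functools import reduce
from math import gcd


def common_divisor_hz(components_hz):
    """Maximum common divisor of integer component frequencies, in Hz."""
    return reduce(gcd, components_hz)


def spacings_hz(components_hz):
    """Differences between consecutive component frequencies, in Hz."""
    ordered = sorted(components_hz)
    return [high - low for low, high in zip(ordered, ordered[1:])]


# First five components of a 100 Hz square wave (odd harmonics only).
square_wave = [100, 300, 500, 700, 900]
# A 100 Hz sawtooth with its fundamental removed (missing fundamental).
missing_fundamental = [200, 300, 400, 500]

print(spacings_hz(square_wave), common_divisor_hz(square_wave))
print(spacings_hz(missing_fundamental), common_divisor_hz(missing_fundamental))
```

For the square wave the spacing is 200 Hz while the common divisor is 100 Hz; for the missing-fundamental signal both give 100 Hz, which is why the spacing hypothesis appeared plausible for that example.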


1.2.5 Alternating Pulse Train

A pulse train is a sum of pulses separated by a constant time interval $T_0$:

$$x(t) = \sum_{k=1}^{\infty} \delta(t - kT_0), \tag{1-5}$$

where $\delta$ is the delta or pulse function, a function whose value is one if its argument is zero, and zero otherwise.

Figure 1-5. Pulse train. A) Signal. B) Spectrum.
Object 1-5. Pulse train (WAV file, 32 KB).

A pulse train with a fundamental frequency of 100 Hz (fundamental period of 10 ms) and its spectrum are shown in Figure 1-5. The spectrum of a pulse train is another pulse train with pulses at multiples of the fundamental frequency, which corresponds to the pitch. If the signal is modified by decreasing the height of every other pulse in the time domain to 0.7, as shown in Figure 1-6, the period of the signal will change to 20 ms. This will be reflected in the spectrum as a change in the fundamental frequency from 100 Hz to 50 Hz. However, although this change may cause an effect on the timbre (depending on the overall level of the signal), the


pitch will remain the same: 100 Hz, refuting the hypothesis that the pitch of a sound corresponds to its fundamental frequency (i.e., the maximum common divisor of the frequency components).

Figure 1-6. Alternating pulse train. A) Signal. B) Spectrum.
Object 1-6. Alternating pulse train (WAV file, 32 KB).

1.2.6 Inharmonic Signals

This section shows another example of a signal whose pitch does not correspond to its fundamental frequency (i.e., the maximum common divisor of its frequency components). Consider a signal built from the 13th, 19th, and 25th harmonics of 50 Hz (i.e., 650, 950, and 1250 Hz), as shown in Figure 1-7. Its fundamental frequency is 50 Hz, but its pitch is 334 Hz (Patel and Balaban, 2001). This is interesting since the ratios between the components and the pitch are far from being integers: 1.95, 2.84, and 3.74. In any case, the pitch of the signal no longer corresponds to its fundamental frequency. Although the true period of the signal is $T_0$ = 20 ms, the signal peaks about every 3 ms, which corresponds to the pitch period of the


signal $t_0$ (see Panel A). These types of signals, for which the components are not integer multiples of the pitch, are called inharmonic signals.

Figure 1-7. Inharmonic signal. A) Signal. $T_0$ corresponds to the fundamental period of the signal and $t_0$ corresponds to the pitch period. B) Spectrum.
Object 1-7. Inharmonic signal (WAV file, 32 KB).

1.3 Loudness

Loudness is another perceptual quality of sound that provides us with information about its source. It is important for pitch because the unification of the components of a sound into a single entity, for which we identify a pitch, may be mediated by the relative loudness of the components of the sound. A conceptual definition of loudness is (Moore, 1997)

“…that attribute of auditory sensation in terms of which sounds can be ordered on a scale extending from quiet to loud.”


The most common unit to measure loudness is the sone. A sone is defined as the loudness elicited by a 1 kHz tone presented at 40 dB sound pressure level. The loudness $L$ of a pure tone is usually modeled as a power function of the sound pressure $P$ of the tone, i.e.,

$$L = k P^{\alpha}, \tag{1-6}$$

where $k$ is a constant that depends on the units and $\alpha$ is the exponent of the power law. In a review of loudness studies, Takeshima et al. (2003) found that the value of $\alpha$ is usually reported to be within the range 0.4-0.6. They also reviewed more elaborate models with many more parameters, but for simplicity, in this work we will use the simpler power model, and for reasons we will explain later, we choose the value of $\alpha$ to be 0.5. In other words, we model the loudness of a tone as being proportional to the square root of its amplitude.

1.4 Equivalent Rectangular Bandwidth

The bandwidth and the distribution of the filters used to extract the spectral components of a sound are important issues that may affect our perception of pitch. Since each point of the cochlea responds better to certain frequencies than to others, the cochlea acts as a spectrum analyzer. The bandwidth of the frequency response of each point of the cochlea is not constant but varies with frequency, being almost proportional to the frequency of maximum response at each point (Glasberg and Moore, 1990).

The concept of Equivalent Rectangular Bandwidth (ERB) was introduced as a description of the spread of the frequency response of a filter. The ERB of a filter $F$ is defined as the bandwidth (in Hertz) of a rectangular filter $R$ centered at the frequency of maximum response of $F$, scaled to have the same output as $F$ at that frequency, and passing the same overall amount of white noise energy as $F$. In other words, when the power responses of $F$ and $R$ are plotted as a


function of frequency, as in Figure 1-8, the central frequency of $R$ corresponds to the mode of $F$, and both curves have the same height and area.

Figure 1-8. Equivalent rectangular bandwidth.

Glasberg and Moore (1990) studied the response of auditory filters at different frequencies, and proposed the following formula to approximate the ERB of the filters:

$$\mathrm{ERB}(f) = 24.7 + 0.108 f. \tag{1-7}$$

Another property of the cochlea is that the relation between frequency and site of maximum excitation in the cochlea is not linear. If the distance between the apex of the cochlea and the site of maximum excitation of a pure tone is plotted as a function of the frequency of the tone, it will be found that a displacement of 0.9 mm in the cochlea corresponds approximately to one ERB (Moore, 1986). Therefore, it is possible to build a scale to measure the position of maximum response in the cochlea for a certain frequency $f$ by integrating the reciprocal of Equation 1-7 to obtain the number of ERBs below $f$, and then multiplying it by 0.9 mm to obtain the position. However,


it is common practice in psychoacoustics to merely compute the number of ERBs below $f$, which can be computed as

$$\mathrm{ERBs}(f) = 21.4 \log_{10}(1 + f/229). \tag{1-8}$$

This scale is shown in Figure 1-9, and it will be the scale used by SWIPE to compute spectral similarity.

Figure 1-9. Equivalent-rectangular-bandwidth scale.

1.5 Dissertation Organization

The rest of this dissertation is organized as follows. Chapter 2 presents previous pitch estimation algorithms that are related to SWIPE, their problems, and possible solutions to these problems. Chapter 3 discusses how these problems, plus some new ones, and their solutions lead to SWIPE. Chapter 4 evaluates SWIPE using publicly available speech/music databases and a disordered speech database. Publicly available implementations of other algorithms are also evaluated on the same databases, and their performance is compared against SWIPE’s.
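As a numerical companion to Equations 1-7 and 1-8 of Section 1.4, the following minimal sketch evaluates both formulas and checks that the slope of the ERB scale is approximately the reciprocal of the ERB. The function names are hypothetical; only the constants come from the equations.

```python
import math


def erb_hz(f_hz):
    """Equation 1-7: approximate ERB (in Hz) of the auditory filter centered at f_hz."""
    return 24.7 + 0.108 * f_hz


def erbs(f_hz):
    """Equation 1-8: number of ERBs below f_hz."""
    return 21.4 * math.log10(1.0 + f_hz / 229.0)


for f in (100.0, 1000.0, 4000.0):
    # Numerical slope of the ERB scale (ERBs per Hz) around f.
    slope = (erbs(f + 1.0) - erbs(f - 1.0)) / 2.0
    print(f"f = {f:6.0f} Hz  ERB = {erb_hz(f):6.1f} Hz  "
          f"ERBs = {erbs(f):5.2f}  slope * ERB = {slope * erb_hz(f):.3f}")
```

The product of the slope and the ERB stays close to one, which is consistent with obtaining the scale by accumulating one unit per ERB of bandwidth; the small deviation comes from the rounded constants in the two equations.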


1.6 Summary

Here we have presented the motivations and applications for pitch estimation. Then, we presented conceptual and operational definitions of pitch, together with the related concept of pitch strength and the duration threshold to perceive pitch. Next, we presented examples of signals and their pitch, together with hypotheses about how pitch is determined. The sawtooth waveform was highlighted, since it plays a key role in the development of SWIPE. Psychoacoustic concepts such as inharmonic signals, loudness, and the ERB scale were also introduced since they are also relevant for the development of SWIPE.


CHAPTER 2
PITCH ESTIMATION ALGORITHMS: PROBLEMS AND SOLUTIONS

This chapter presents some well known pitch estimation algorithms that appear in the literature. These algorithms were chosen because of their influence upon the creation of SWIPE. We will present the algorithms in a very basic form with the intent to capture their essence in a simple expression, although their actual implementations may have extra details that we do not present here. The purpose of those details is usually to fine tune the algorithms, but the actual power of the algorithms is based on the essence we describe here.

Figure 2-1. General block diagram of pitch estimators.


Traditionally, there have been two types of pitch estimation algorithms (PEAs): algorithms based on the spectrum¹ of the signal, and algorithms based on the time-domain representation of the signal. The time-domain based algorithms presented in this chapter can also be formulated based on the spectrum of the signal, which will be the approach followed here.

The basic steps that most PEAs perform to track the pitch of a signal are shown in the block diagram of Figure 2-1. First, the signal is split into windows. Then, for each window the following steps are performed: (i) the spectrum is estimated using a short-time Fourier transform (STFT), (ii) a score is computed for each pitch candidate within a predefined range by computing an integral transform (IT) over the spectrum, and (iii) the candidate with the highest score is selected as the estimated pitch.

The algorithms will be presented in an order that is convenient for our purposes, but does not necessarily correspond to the chronological order in which they were developed.

2.1 Harmonic Product Spectrum (HPS)

The first algorithm to be presented is Harmonic Product Spectrum (HPS) (Schroeder, 1968). This algorithm estimates the pitch as the frequency that maximizes the product of the spectrum at harmonics of that frequency, i.e., as

$$p = \arg\max_{f} \prod_{k=1}^{n} |X(kf)|, \tag{2-1}$$

where $X$ is the estimated spectrum of the signal, $n$ is the number of harmonics to be used (typically between 5 and 11), and $p$ is the estimated pitch. The purpose of limiting the number of harmonics to $n$ is to reduce the computational cost, but there is no logical reason behind this limit; it is hard to believe that the $n$-th harmonic is useful for pitch estimation, but not the $(n+1)$-th.

¹ Since all the pitch estimators presented here use the magnitude of the spectrum but not its phase, the words “magnitude of” will be omitted, and the word spectrum should be interpreted as magnitude of the spectrum unless explicitly noted otherwise.
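The scoring and selection steps of Equation 2-1 can be sketched on a toy line spectrum. This is a minimal illustration, assuming an idealized spectrum stored as a dictionary from integer frequencies (Hz) to magnitudes and taken as zero elsewhere; it is not an implementation of a full STFT-based estimator, and the function names are hypothetical.

```python
def hps_score(spectrum, f_hz, n):
    """Product of the spectrum at the first n harmonics of the candidate (Equation 2-1)."""
    score = 1.0
    for k in range(1, n + 1):
        score *= spectrum.get(k * f_hz, 0.0)  # magnitude is zero off the spectral lines
    return score


def hps_pitch(spectrum, candidates_hz, n=5):
    """Candidate with the highest harmonic product."""
    return max(candidates_hz, key=lambda f_hz: hps_score(spectrum, f_hz, n))


# Idealized sawtooth spectrum: 12 harmonics of 200 Hz with amplitudes 1/k.
sawtooth = {200 * k: 1.0 / k for k in range(1, 13)}

candidates = range(50, 1001, 50)
print(hps_pitch(sawtooth, candidates))
```

In this toy case only candidates whose first five harmonics all coincide with spectral lines obtain a nonzero product, so the choice is driven entirely by the complete harmonic series.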


Since the logarithm is an increasing function, an equivalent approach is to estimate the pitch as the frequency that maximizes the logarithm of the product of the spectrum at harmonics of that frequency. Since the logarithm of a product is equal to the sum of the logarithms of the terms, HPS can also be written as

$$p = \arg\max_{f} \sum_{k=1}^{n} \log |X(kf)|, \tag{2-2}$$

or using an integral transform, as

$$p = \arg\max_{f} \int_{0}^{\infty} \log |X(f')| \sum_{k=1}^{n} \delta(f' - kf) \, df'. \tag{2-3}$$

Figure 2-2. Harmonic product spectrum.
Object 2-1. Bandpass filtered /u/ (WAV file, 6 KB).

Figure 2-2 shows the kernel of this integral for a pitch candidate with frequency 190 Hz. A pitfall of this algorithm is that if any of the harmonics is missing (i.e., its energy is zero), the product will be zero (equivalently, the sum of the logarithms will be minus infinity) for the candidate corresponding to the pitch, and therefore the pitch will not be recognized. Figure 2-2


also shows the spectrum of the vowel /u/ (as in good) with a pitch of 190 Hz (Object 2-1). This sample was passed through a filter with a passband of 300 to 3400 Hz to simulate telephone-quality speech. Therefore, the fundamental is missing and HPS is not able to recognize the pitch of this signal. Another salient characteristic of this sample is its intense second harmonic at 380 Hz, probably caused by the first formant of the vowel, which is on average around 380 Hz as well (Huang, Acero, and Hon, 2001).

2.2 Sub-harmonic Summation (SHS)

Figure 2-3. Subharmonic summation.

An algorithm that has no problem with missing harmonics is Sub-Harmonic Summation (SHS) (Hermes, 1988), which solves the problem by using addition instead of multiplication. Therefore, if any harmonic is missing, it will not contribute to the total, but will not bring the sum to zero either. In mathematical terms, SHS estimates the pitch as

$$p = \arg\max_{f} \sum_{k=1}^{n} |X(kf)|, \tag{2-4}$$


or using an integral transform as

$$p = \arg\max_{f} \int_{0}^{\infty} |X(f')| \sum_{k=1}^{n} \delta(f' - kf) \, df'. \tag{2-5}$$

An example of the kernel of this integral is shown in Figure 2-3.

A pitfall of this algorithm is that since it gives the same weight to all the harmonics, subharmonics of the pitch may have the same score as the pitch, and therefore they are valid candidates for being recognized as the pitch. For example, suppose that a signal has a spectrum consisting of only one component at $f$ Hz. By definition, the pitch of the signal is $f$ Hz as well. However, since the algorithm adds the spectrum at $n$ multiples of the candidate, each of the subharmonics $f/2$, $f/3$, …, $f/n$ will have the same score as $f$, and therefore they are equally valid to be recognized as the pitch.

Figure 2-4. Subharmonic summation with decay.
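The subharmonic tie described above for the unweighted sum of Equation 2-4 can be reproduced with a toy spectrum. The sketch assumes a single spectral line at 600 Hz (a hypothetical value chosen so that its first few subharmonics are whole numbers of Hz) and a dictionary representation of the spectrum.

```python
def shs_score(spectrum, f_hz, n=5):
    """Unweighted sum of the spectrum at the first n harmonics of the candidate (Equation 2-4)."""
    return sum(spectrum.get(k * f_hz, 0.0) for k in range(1, n + 1))


# Spectrum with a single component at 600 Hz.
pure_tone = {600: 1.0}

# The candidate at 600 Hz and its subharmonics 600/2, ..., 600/5 all collect the same line.
for candidate in (600, 300, 200, 150, 120, 100):
    print(candidate, shs_score(pure_tone, candidate))
```

With n = 5, the candidates 600, 300, 200, 150, and 120 Hz all receive the same score, while 100 Hz receives zero because 600 Hz would be its sixth harmonic.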


This problem can be solved by introducing a monotonically decaying weighting factor for the harmonics. SHS implements this idea by weighting the harmonics with a geometric progression as

$$p = \arg\max_{f} \int_{0}^{\infty} |X(f')| \sum_{k=1}^{n} r^{k-1} \delta(f' - kf) \, df', \tag{2-6}$$

where the value of $r$ was empirically set to 0.84 based on experiments using speech. The kernel of this integral is shown in Figure 2-4. SHS is the only algorithm in this chapter that solves the subharmonic problem by applying this decay factor. Later, another algorithm will be presented (Biased Autocorrelation) which solves this problem in a different way.

2.3 Subharmonic to Harmonic Ratio (SHR)

A drawback of the algorithms presented so far is that they examine the spectrum only at the harmonics of the fundamental, ignoring the contents of the spectrum everywhere else. An example will illustrate why this is a problem. Suppose that the input signal is white noise (i.e., a signal with a flat spectrum). This signal is perceived as having no pitch. However, the previous algorithms will produce the same score for each pitch candidate, making each of them a valid estimate for the pitch.

This problem is solved by the Subharmonic to Harmonic Ratio algorithm (SHR) (Sun, 2000), which not only adds the spectrum at harmonics of the pitch candidate, but also subtracts the spectrum at the middle points between harmonics. However, this algorithm uses the logarithm of the spectrum, and therefore has the problem previously discussed for HPS. Also, this algorithm gives the same weight to all the harmonics and therefore it suffers from the subharmonics problem. SHR can be written as

$$p = \arg\max_{f} \int_{0}^{\infty} \log |X(f')| \sum_{k=1}^{n} \big[ \delta(f' - kf) - \delta(f' - (k - 1/2) f) \big] \, df'. \tag{2-7}$$
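The effect of the decay factor in Equation 2-6 and of the subtraction in Equation 2-7 can be sketched with spectra given as plain functions of frequency. The toy spectra below (a single line at 600 Hz, a flat spectrum, and a harmonic spectrum with a small floor to keep the logarithm finite) are assumptions made for illustration, as are the function names.

```python
import math


def shs_decay_score(X, f_hz, n=5, r=0.84):
    """Sum of the spectrum at harmonics k*f weighted by r**(k-1) (Equation 2-6)."""
    return sum(r ** (k - 1) * X(k * f_hz) for k in range(1, n + 1))


def shr_score(X, f_hz, n=5):
    """Log-spectrum at harmonics minus log-spectrum at midpoints (Equation 2-7)."""
    return sum(math.log(X(k * f_hz)) - math.log(X((k - 0.5) * f_hz))
               for k in range(1, n + 1))


def single_component(f_hz):
    """Toy spectrum with one line at 600 Hz and zero elsewhere."""
    return 1.0 if abs(f_hz - 600.0) < 1e-9 else 0.0


def flat_spectrum(f_hz):
    """Toy white-noise spectrum: the same magnitude at every frequency."""
    return 2.0


def harmonic_with_floor(f_hz):
    """Toy harmonic spectrum: 1.0 at multiples of 200 Hz, 0.01 elsewhere."""
    ratio = f_hz / 200.0
    return 1.0 if abs(ratio - round(ratio)) < 1e-9 else 0.01


# With decay, the subharmonics of 600 Hz score progressively lower.
for candidate in (600.0, 300.0, 200.0):
    print(candidate, shs_decay_score(single_component, candidate))

# SHR gives zero for the flat spectrum and a positive score for the harmonic one.
print(shr_score(flat_spectrum, 200.0), shr_score(harmonic_with_floor, 200.0))
```

The decay breaks the tie among subharmonics because a line reached as the k-th harmonic is scaled by r raised to k-1, and the midpoint subtraction cancels any contribution that is equally present between harmonics.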

PAGE 37

Figure 2-5. Subharmonic to harmonic ratio.

The kernel of the integral is shown in Figure 2-5. Notice that SHR will produce a positive score for a signal with a harmonic spectrum and a score of zero for white noise. However, this algorithm has a problem that is shared by all the algorithms presented so far: since they examine the spectrum only at harmonic locations, they cannot recognize the pitch of inharmonic signals.

Before we move on to the next algorithm, we wish to add some insight to SHR. If we further divide the sum in Equation 2-7 by n, the algorithm would compute the average peak-to-valley ratio, where the peaks are expected to be at the harmonics of the candidate, and the valleys are expected to be at the middle point between harmonics. This idea will be exploited later by SWIPE, albeit with some refinements: the average will be weighted, the ratio will be replaced with the distance, and the peaks and valleys will be examined over wider and blurred regions.

2.4 Harmonic Sieve (HS)

One algorithm that is able to recognize the pitch of some inharmonic signals is the Harmonic Sieve (HS) (Duifhuis and Willems, 1982). This algorithm is similar to SHS, but has

PAGE 38

two key differences: instead of using pulses it uses rectangles, and instead of computing the inner product between the spectrum and the rectangles, it counts the number of rectangles that contain at least one component (a rectangle is said to contain a component if the component fits within the rectangle and its amplitude exceeds a certain threshold T). The rectangles are centered at the harmonics of the pitch candidates, and their width is 8% of the frequency of the harmonics. This algorithm can be expressed mathematically as

$$p = \arg\max_{f} \sum_{k=1}^{n} \left[\, T < \max_{f' \in (0.96\,kf,\ 1.04\,kf)} |X(f')| \,\right], \tag{2-8}$$

where [ · ] is the Iverson bracket (i.e., it produces a value of one if the bracketed proposition is true, and zero otherwise). Notice that the expression in the sum is a non-linear function of the spectrum, and therefore this algorithm cannot be written using an integral transform. Figure 2-6 shows the kernel used by this algorithm.

Figure 2-6. Harmonic sieve.
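The counting in Equation 2-8 can be sketched as follows. This is a simplified illustration, assuming the spectrum is already reduced to a list of (frequency, amplitude) components; the names `sieve_count` and `harmonic_sieve` and the test values are hypothetical.

```python
def sieve_count(components, f, n, threshold):
    """Number of rectangles of Equation 2-8 that contain a component.

    components is a list of (frequency_hz, amplitude) pairs.
    """
    count = 0
    for k in range(1, n + 1):
        low, high = 0.96 * k * f, 1.04 * k * f      # width is 8% of k*f
        # Iverson bracket: 1 if some component above the threshold fits inside.
        if any(low < freq < high and amp > threshold for freq, amp in components):
            count += 1
    return count


def harmonic_sieve(components, candidates, n, threshold):
    """Candidate with the largest rectangle count."""
    return max(candidates, key=lambda f: sieve_count(components, f, n, threshold))


mistuned = [(200.0, 1.0), (400.0, 1.0), (610.0, 1.0)]   # third partial is 10 Hz sharp
estimate = harmonic_sieve(mistuned, [100.0, 200.0, 400.0], n=3, threshold=0.1)
print(estimate, sieve_count(mistuned, 200.0, 3, 0.1))
```

The mistuned partial at 610 Hz still falls inside the third rectangle of the 200 Hz candidate (576 to 624 Hz), so all three rectangles are counted. Moving that partial to 625 Hz would place it just outside the rectangle and reduce the count to two.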

PAGE 39

A pitfall of HS is that, when a component is close to an edge of a rectangle, a slight change in its frequency could put it in or out of the rectangle, possibly changing the estimated pitch drastically. Such radical changes do not typically occur in pitch perception, where small changes in the frequency of the components lead to small changes in the perceived pitch, as mentioned in Section 1.2.6. This problem can be solved by using smoother boundaries to decide whether a component should be considered as a harmonic or not, as done by the next algorithm.

2.5 Autocorrelation (AC)

One of the most popular methods for pitch estimation is autocorrelation. The autocorrelation function r(t) of a signal x(t) measures the correlation of the signal with itself after a lag of size t, i.e.,

$$r(t) = \lim_{T \to \infty} \frac{1}{T} \int_{-T/2}^{T/2} x(t')\, x(t' + t)\, dt'. \tag{2-9}$$

The Wiener-Khinchin theorem shows that autocorrelation can also be computed as the inverse Fourier cosine transform of the squared spectrum of the signal, i.e., as

$$r(t) = \int_0^\infty |X(f)|^2 \cos(2\pi f t)\, df. \tag{2-10}$$

The autocorrelation-based pitch estimation algorithm (AC) estimates the pitch as the frequency whose inverse maximizes the autocorrelation function of the signal, i.e., as

$$p = \arg\max_{f < f_{\max}} \int_0^\infty |X(f')|^2 \cos(2\pi f'/f)\, df', \tag{2-11}$$

where the parameter fmax is introduced to avoid the maximum that the integral has at infinity. The kernel for this integral is shown in Figure 2-7. It is easy to see that as f increases, the kernel stretches without limit, and since the cosine starts with a value of one and decays smoothly, eventually it will give a weight of one to the whole spectrum, producing a maximum at infinity.

PAGE 40

Figure 2-7. Autocorrelation.

Notice that this problem can be easily solved by removing the first quarter of the first cycle of the cosine (i.e., setting it to zero). Since the DC of a signal (i.e., its zero-frequency component) only adds a constant to the signal, ignoring the DC should not affect the pitch estimation of a periodic signal.

Because of the frequency domain representation of autocorrelation, we can see that there is a large resemblance between AC and SHR (compare the kernel of Figure 2-7 with the kernel of Figure 2-5), although with three main differences. First, instead of using an alternating sequence of pulses, AC uses a cosine, which adds a smooth interpolation between the pulses. Second, AC adds an extra lobe at DC, which was already shown to have a negative effect. Third, AC uses the power of the spectrum (i.e., the squared spectrum) instead of the logarithm of the spectrum. Therefore, both algorithms measure the average peak-to-valley distance, one in the power domain and the other in the logarithmic domain, although AC does it in a much smoother way.

PAGE 41

There is also a similarity between AC and HS (compare the kernel of Figure 2-7 with the kernel of Figure 2-6). HS allows for inharmonicity of the components of the signal by considering as harmonic any component within a certain distance from a harmonic of the candidate pitch. AC does the same in a smoother way by assigning to a component a weight that is a function of its distance to the closest harmonic of the candidate pitch; the smaller the distance, the larger the weight, and the larger the distance, the smaller the weight. In fact, if the component is too far from any harmonic, its weight can be negative.

Like all the algorithms presented so far, except SHS, AC exhibits the subharmonics problem caused by the equal weight given to all the harmonics (see Section 2.2). To solve this problem, it is common to take the local maximum of highest frequency rather than the global maximum. However, this technique sometimes fails. For example, consider a signal with fundamental frequency 200 Hz (i.e., period of 5 ms) and first four harmonics with amplitudes 1, 6, 1, 1, as shown in Figure 2-8A (Object 2-2). Except at very low intensity levels, the four components are audible, and the pitch of the signal corresponds to its fundamental frequency. However, as shown in Figure 2-8C, AC has its first non-zero local maximum at 2.5 ms, which corresponds to a pitch of 400 Hz.

Another common solution is to use the biased autocorrelation (BAC) (Sondhi, 1968; Rabiner, 1977), which introduces a factor that penalizes the selection of low pitch values. This factor gives a weight of one to a pitch period of zero and decays linearly to zero for a pitch period corresponding to the window size T. This can be written as

$$p = \arg\max_{1/T < f < f_{\max}} \left(1 - \frac{1}{Tf}\right) \int_0^\infty |X(f')|^2 \cos(2\pi f'/f)\, df'. \tag{2-12}$$

PAGE 42

Figure 2-8. Comparison between AC, BAC, ASDF, and AMDF. A) Spectrum of a signal with pitch and fundamental frequency of 200 Hz. B) Waveform of the signal with a fundamental period of 5 ms. C) AC has a maximum at every multiple of 5 ms, making it hard to choose the best candidate. The first (non-zero) local maximum is at 2.5 ms, making the “first peak” criterion fail. D) BAC has its first peak and its largest non-zero local maximum at 2.5 ms. E) ASDF is an inverted, shifted, and scaled AC. F) AMDF is similar to ASDF.

Object 2-2. Signal with strong second harmonic (WAV file, 32 KB)

However, the combination of this bias and the squaring of the spectrum may introduce new problems. For example, if T = 20 ms as in the BAC function of Figure 2-8D, the bias will make the height of the peak at 2.5 ms larger than the height of the peak at 5 ms, consequently causing an incorrect pitch estimate.
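The example with harmonic amplitudes 1, 6, 1, 1 can be reproduced with a few lines of arithmetic. The sketch below evaluates the autocorrelation of a line spectrum as a sum of cosines weighted by the squared amplitudes (the discrete counterpart of Equation 2-10, up to a constant factor) and then applies the linear bias 1 - t/T; the function names are illustrative.

```python
import math

F0 = 200.0                          # fundamental frequency in Hz
AMPLITUDES = [1.0, 6.0, 1.0, 1.0]   # amplitudes of harmonics 1 to 4
T = 0.020                           # window size in seconds (20 ms)


def autocorr(lag):
    """Autocorrelation of the line spectrum at the given lag (seconds)."""
    return sum(a * a * math.cos(2 * math.pi * k * F0 * lag)
               for k, a in enumerate(AMPLITUDES, start=1))


def biased_autocorr(lag):
    """Autocorrelation multiplied by the linear bias 1 - lag/T."""
    return (1.0 - lag / T) * autocorr(lag)


for lag_ms in (2.5, 5.0):
    lag = lag_ms / 1000.0
    print(f"lag {lag_ms} ms: AC = {autocorr(lag):.3f}, BAC = {biased_autocorr(lag):.3f}")
```

At 2.5 ms the unbiased value is -1 + 36 - 1 + 1 = 35, against 39 at 5 ms, so the global maximum of AC is still at the true period. After the bias, the values become 0.875 × 35 = 30.625 and 0.75 × 39 = 29.25, so the peak at 2.5 ms overtakes the peak at 5 ms.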

PAGE 43

2.6 Average Magnitude and Squared Difference Functions (AMDF, ASDF)

Two functions similar to autocorrelation (in the sense that they compare the signal with itself after a lag of size t) are the average magnitude difference function (AMDF) and the average squared difference function (ASDF). The AMDF is defined as

$$d(t) = \frac{1}{T} \int_{-T/2}^{T/2} |x(t') - x(t' + t)|\, dt', \tag{2-13}$$

and the ASDF as

$$s(t) = \frac{1}{T} \int_{-T/2}^{T/2} [x(t') - x(t' + t)]^2\, dt'. \tag{2-14}$$

It is easy to show that ASDF and autocorrelation are related through the equation (Ross, 1974)

$$s(t) = 2\,[\,r(0) - r(t)\,], \tag{2-15}$$

and therefore, s(t) is just an inverted, shifted, and scaled version of autocorrelation. Therefore, as illustrated in panels C (or D) and E of Figure 2-8, where (biased) autocorrelation has peaks, s(t) has dips. Thus, an ASDF-based algorithm must look for minima instead of maxima to estimate pitch.

It has also been shown (Ross, 1974) that d(t) can be approximated as

$$d(t) \approx \beta(t)\, [s(t)]^{1/2}. \tag{2-16}$$

Although the relation between d(t) and s(t) depends on t through β(t), it is found in practice that this factor does not play a significant role, and a large similarity between d(t) and s(t) exists, as observed in panels E and F of Figure 2-8.

Therefore, since the functions r(t), s(t), and d(t) are so strongly related, none of them is expected to offer much more than the others for pitch estimation. However, modifications to these functions, which cannot be expressed in terms of the other functions, have been used successfully to improve their performance on pitch estimation.
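The relation of Equation 2-15 can be verified numerically. The sketch below is a discrete check under a stated simplification: the signal is one period of a periodic waveform and the lag is applied circularly, which stands in for the long-window limit of the definitions above. The function names and the test signal are illustrative.

```python
import math


def circular_autocorr(x, lag):
    """Mean of x[i] * x[i + lag] with circular indexing."""
    n = len(x)
    return sum(x[i] * x[(i + lag) % n] for i in range(n)) / n


def circular_asdf(x, lag):
    """Mean of (x[i] - x[i + lag])**2 with circular indexing."""
    n = len(x)
    return sum((x[i] - x[(i + lag) % n]) ** 2 for i in range(n)) / n


N = 64   # samples in one period
x = [math.sin(2 * math.pi * i / N) + 0.5 * math.cos(4 * math.pi * i / N)
     for i in range(N)]

r0 = circular_autocorr(x, 0)
max_error = max(abs(circular_asdf(x, lag) - 2 * (r0 - circular_autocorr(x, lag)))
                for lag in range(N))
print(max_error)
```

The difference stays at the level of floating-point rounding for every lag, which is consistent with s(t) being an inverted, shifted, and scaled copy of r(t).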

PAGE 44

An example is given by YIN (de Cheveigne, 2002), which uses a variation of s(t) to avoid the dip at lag zero, improving its performance. Another variation is the one we proposed in the previous section (i.e., the removal of the first quarter of the cosine) to avoid the maximum at zero lag for autocorrelation.

2.7 Cepstrum (CEP)

An algorithm similar to AC is the cepstrum-based pitch estimation algorithm (CEP) (Noll, 1967). The cepstrum c(t) of a signal x(t) is very similar to its autocorrelation. The only difference is that it uses the logarithm of the spectrum instead of its square, i.e.,

$$c(t) = \int_0^\infty \log|X(f)| \cos(2\pi f t)\, df. \tag{2-17}$$

Figure 2-9. Cepstrum.

CEP estimates the pitch as the frequency whose inverse maximizes the cepstrum of the signal, i.e., as

PAGE 45

$$p = \arg\max_{f < f_{\max}} \int_0^\infty \log|X(f')| \cos(2\pi f'/f)\, df'. \tag{2-18}$$

The kernel of this integral is shown in Figure 2-9. Like AC, CEP exhibits the subharmonics problem and the problem of having a maximum at a large value of f. The maximum is not necessarily at infinity because, depending on the scaling of the signal, the logarithm of the spectrum may be negative at large frequencies, and therefore assigning a positive weight to that region may in fact decrease the score. Figure 2-10 shows the spectrum of the speech signal that has been used in previous figures and the kernel that produces the highest score for that spectrum, which corresponds to a candidate pitch of about 10 kHz. Notice that the logarithm of the spectrum was arbitrarily set to zero for frequencies below 300 Hz because its original value (minus infinity) would make the evaluation of the integral in Equation 2-18 infeasible. This problem of the use of the logarithm when there are missing harmonics was already discussed in Section 2.1.

Figure 2-10. Problem caused to cepstrum by cosine lobe at DC.

PAGE 46

2.8 Summary

In this chapter we presented pitch estimation algorithms that have influenced the creation of SWIPE. The most common problems found in these algorithms were the inability to deal with missing harmonics (HPS, SHR, and CEP) and inharmonic signals (HPS, SHS, and SHR), and the tendency to produce high scores for subharmonics of the pitch (all the algorithms, although to a lesser extent SHS and BAC). Solutions to these problems were either found in other algorithms or were proposed by us.

PAGE 47

CHAPTER 3
THE SAWTOOTH WAVEFORM INSPIRED PITCH ESTIMATOR

Aiming to improve upon the algorithms presented in Chapter 2, we propose the Sawtooth Waveform Inspired Pitch Estimator (SWIPE)². The seed of SWIPE is the implicit idea of the algorithms presented in Chapter 2: to find the frequency that maximizes the average peak-to-valley distance at harmonics of that frequency. However, this idea will be implemented trying to avoid the problem-causing features found in those algorithms. This will be achieved by avoiding the use of the logarithm of the spectrum, applying a monotonically decaying weight to the harmonics, observing the spectrum in the neighborhood of the harmonics and middle points between harmonics, and using smooth weighting functions.

3.1 Initial Approach: Average Peak-to-Valley Distance Measurement

If a signal is periodic with fundamental frequency f, its spectrum must contain peaks at multiples of f and valleys in between. Since each peak is surrounded by two valleys, the average peak-to-valley distance (APVD) for the k-th peak is defined as

$$d_k(f) = \frac{1}{2}\left[\,|X(kf)| - |X((k - 1/2)f)|\,\right] + \frac{1}{2}\left[\,|X(kf)| - |X((k + 1/2)f)|\,\right] = |X(kf)| - \frac{1}{2}\left[\,|X((k - 1/2)f)| + |X((k + 1/2)f)|\,\right]. \tag{3-1}$$

Averaging over the first n peaks, the global APVD is

$$D_n(f) = \frac{1}{n} \sum_{k=1}^{n} d_k(f) = \frac{1}{n}\left[ \frac{1}{2}|X(f/2)| - \frac{1}{2}|X((n + 1/2)f)| + \sum_{k=1}^{n} \left( |X(kf)| - |X((k - 1/2)f)| \right) \right]. \tag{3-2}$$

² The name of the algorithm will become clear in a later section.
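The step from Equation 3-1 to Equation 3-2 is a telescoping of the valley terms, and it can be checked numerically. In the sketch below, `mag` is a hypothetical, arbitrary magnitude function chosen only to exercise the identity; the function names are likewise illustrative.

```python
import math


def peak_to_valley(mag, f, k):
    """d_k(f) of Equation 3-1: peak at k*f against its two neighboring valleys."""
    return mag(k * f) - 0.5 * (mag((k - 0.5) * f) + mag((k + 0.5) * f))


def global_apvd(mag, f, n):
    """D_n(f) as the plain average of d_1(f), ..., d_n(f)."""
    return sum(peak_to_valley(mag, f, k) for k in range(1, n + 1)) / n


def global_apvd_closed_form(mag, f, n):
    """D_n(f) written as in Equation 3-2."""
    total = 0.5 * mag(f / 2) - 0.5 * mag((n + 0.5) * f)
    total += sum(mag(k * f) - mag((k - 0.5) * f) for k in range(1, n + 1))
    return total / n


# Arbitrary smooth magnitude function, used only to test the identity.
mag = lambda fp: 1.0 / (1.0 + fp / 100.0) + 0.3 * abs(math.cos(fp / 37.0))

print(global_apvd(mag, 190.0, 8), global_apvd_closed_form(mag, 190.0, 8))
```

Both forms agree for any magnitude function, because each interior valley appears once with weight 1/2 from the peak below it and once with weight 1/2 from the peak above it, while the first and last valleys appear only once.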

PAGE 48

Figure 3-1. Average-peak-to-valley-distance kernel.

Our first approach to estimate pitch is to find the frequency that maximizes the APVD. Staying with the integral transform notation used in Chapter 2, and dropping the unnecessary 1/n term, the algorithm can be expressed as

$$p = \arg\max_{f < f_{\max}} \int_0^\infty |X(f')|\, K_n(f, f')\, df', \tag{3-3}$$

where

$$K_n(f, f') = \frac{1}{2}\,\delta(f' - f/2) - \frac{1}{2}\,\delta\!\left(f' - (n + 1/2) f\right) + \sum_{k=1}^{n} \left[ \delta(f' - kf) - \delta\!\left(f' - (k - 1/2) f\right) \right]. \tag{3-4}$$

The kernel K_n(f, f') for f = 190 Hz is shown in Figure 3-1 together with the spectrum of the sample vowel /u/ used in Chapter 2, which will be used extensively in this chapter as well. The kernel is a function not only of the frequencies but also of n, the number of harmonics to be used. Each positive pulse in the kernel has a weight of 1, each negative pulse between positive pulses has a weight of -1, and the first and last negative pulses have a weight of -1/2. This kernel is

PAGE 49

similar to the kernel used by SHR (see Chapter 2), with the only difference that in K_n the first negative pulse has a weight of -1/2 and K_n has an extra negative pulse at the end, also with a weight of -1/2.

3.2 Blurring of the Harmonics

The previous method of measuring the APVD works if the signal is harmonic, but not if it is inharmonic. To allow for inharmonicity, our first approach was to blur the location of the harmonics by replacing each pulse with a triangle function with base f/2,

$$\Lambda_f(f') = \begin{cases} f/4 - |f'|, & \text{if } |f'| < f/4, \\ 0, & \text{otherwise.} \end{cases} \tag{3-5}$$

The base of the triangle was set to f/2 to produce a triangular wave as shown in Figure 3-2. To be consistent with the APVD measure, the first and last negative triangles were given a height of 1/2. One reason for using a base that is proportional to the candidate pitch is that it allows for a pitch-independent handling of inharmonicity, as seems to be done in the auditory system (see Section 1.2.6).

Figure 3-2. Triangular wave kernel.

PAGE 50

Figure 3-3. Necessity of strictly convex kernels.

Object 3-1. Beating tones (WAV file, 32 KB)

The triangular kernel approach was abandoned because it was found that the components of the kernel must be strictly concave (i.e., must have a continuous second derivative) at their maxima. The following example will illustrate why this is necessary. Suppose we have a signal with components at 200 and 220 Hz, as shown in Figure 3-3 (Object 3-1). This signal is perceived as a loudness-varying tone with a pitch of 210 Hz, a phenomenon known as beating. However, the triangular kernel produces the same score for each candidate between 200 and 220 Hz. This is easy to see by slightly stretching or compressing the kernel such that its first positive peak is within that range. Such stretching or compression would cause an increment in the weight of one of the components and a decrement of the same amount in the other, keeping the score constant.

Therefore, the triangle was discarded and concatenations of truncated squarings, Gaussians, and cosines were explored. The squaring function was truncated at its fixed point, and

PAGE 51

the Gaussian and the cosine functions were truncated at their inflection points. The Gaussian was truncated at the inflection points to ensure that the concatenation of positive and negative Gaussians has a continuous second derivative. The same can be said about the cosine, but furthermore, the concatenation of positive and negative cosine lobes produces a cosine, which has derivatives of all orders.

Figure 3-4. Kernels formed from concatenations of truncated squarings, Gaussians, and cosines.

Concatenations of these three functions, stretched or compressed to form the desired pattern of maxima at multiples of the candidate pitch, are illustrated in Figure 3-4. Although informal tests showed no significant differences in pitch estimation performance among the three, the cosine was preferred because of its simplicity. Notice also that this kernel is the one used by the AC and CEP pitch estimators (see Chapter 2).

3.3 Warping of the Spectrum

As mentioned in Chapter 2, the use of the logarithm of the spectrum in an integral transform is inconvenient because there may be regions of the spectrum with no energy, which

PAGE 52

would prevent the evaluation of the integral, since the logarithm of zero is minus infinity. But even if there is some small energy in those regions, the large absolute value of the logarithm could make the effect of these low energy regions on the integral larger than the effect of the regions with the most energy, which is certainly inconvenient.

Figure 3-5. Warping of the spectrum.

To avoid this situation, the use of the logarithm of the spectrum was discarded and other commonly used functions were explored: square, identity, and square-root. Figure 3-5 shows how these functions warp the spectrum of the vowel /u/ used in Chapter 2. As mentioned earlier, this spectrum has two particularities: it has a missing fundamental, and it has a salient second

PAGE 53

harmonic. The missing fundamental is evident in panel B, which shows that the logarithm of the spectrum in the region of 190 Hz is minus infinity. The salient second harmonic at 380 Hz shows up clearly in the other three panels, but especially in panel C, where the spectrum has been squared. Panel D shows the square-root of the spectrum, which overemphasizes neither the missing fundamental (as the logarithm does) nor the salient second harmonic (as the square does).

We believe the square-root warping of the spectrum is more convenient for three reasons. First, it better matches the response of the auditory system to amplitude, which is close to a power function with an exponent in the range 0.4-0.6 (see Chapter 2); second, it allows for a weighting of the harmonics proportional to their amplitude, as we will show in the next section; and third, it produces better pitch estimates, as found in tests presented later.

3.4 Weighting of the Harmonics

To avoid the subharmonics problem presented in Chapter 2, a decaying weighting factor was applied to the harmonics. The types of decays explored were exponential and harmonic. For exponential decays, a weight of r^(k-1) was applied to the k-th harmonic (k = 1, 2, …, n, and r = 0.9, 0.7, 0.5) through the multiplication of the kernel by the envelope r^(f'/f - 1), as shown in Figure 3-6. For harmonic decays, a weight of 1/k^p was applied to the k-th harmonic (k = 1, 2, …, n, and p = 1/2, 1, 2) through the multiplication of the kernel by the envelope (f/f')^p, as shown in Figure 3-6. In informal tests, the best results were obtained using harmonic decays with p = 1/2, which matches the decay of the square-root of the average spectrum of vowels (see Chapter 2). In other words, better pitch estimates were obtained when computing the inner product (IP) of the square-root of the input spectrum and the square-root of the expected spectrum than when computing the IP over the raw spectra.

PAGE 54

Figure 3-6. Weighting of the harmonics.

One explanation for this is that when the input spectrum matches its corresponding template (i.e., the expected spectrum for that pitch), the use of the square-root of the spectra in the IP gives to each harmonic a weight proportional to its amplitude. For example, if the input spectrum has the expected shape for a vowel, i.e., the amplitudes of the harmonics decay as 1, 1/2, 1/3, etc., then their square roots decay as 1, 1/√2, 1/√3, etc. Since the terms in the sum of the IP are the squares of these values (i.e., 1, 1/2, 1/3, etc.), the relative contribution of each harmonic is proportional to its amplitude. Conversely, if we compute the IP over the raw spectra, the terms of the sum will be 1, 1/4, 1/9, etc., which are not proportional to the amplitude of the components, but to their square. This would make the contribution of the strongest harmonics too large and the contribution of the weakest too small. The situation would be even worse if we computed the IP over the energy of the spectrum (i.e., its square). The expected energy of the harmonics for a vowel follows the pattern 1, 1/4, 1/9, etc., and computing the IP of the

PAGE 55

energy of the harmonics with itself produces the terms 1, 1/16, 1/81, etc., which gives too much weight to the first harmonic and almost no weight to the other harmonics.

In the ideal case in which there is a perfect match between the input and the template, any of the previous types of warping would produce the same result: a normalized inner product (NIP) equal to 1. However, the likelihood of a perfect match is low, and the warping may play a big role in the determination of the best match, as we found in informal tests, which show that the use of the square-root of the spectrum produces better pitch estimates.

3.5 Number of Harmonics

An important issue is the number of harmonics to be used to analyze the pitch. HPS, SHS, SHR, and HS use a fixed finite number of harmonics, and CEP and AC use all the available harmonics (i.e., as many as the sampling frequency allows). In informal tests the best results were obtained when using as many harmonics as available, although it was found that going beyond 3.3 kHz for speech and 5 kHz for musical instruments did not improve the results significantly. Thus, to reduce computational cost it is reasonable to set these limits.

3.6 Warping of the Frequency Scale

As mentioned in Section 3.4, if the input perfectly matches any of the templates, their NIP will be equal to 1, regardless of the type of warping used on the spectrum. The same applies to the frequency scale. However, since a perfect match will rarely occur, a warping of the frequency scale may play a role in determining the best match.

For the purposes of computing the integral of a function, we can think of a warping of the scale as the process of sampling the function more finely in some regions than others, effectively giving more emphasis to the more finely sampled regions. In our case, since we are computing an inner product to estimate pitch, it makes sense to sample the spectrum more finely in the region that contributes the most to the determination of pitch. It seems reasonable to assume that

PAGE 56

this region is the one with the most harmonic energy. In the case of speech, and assuming that the amplitude of the harmonics decays inversely proportional to frequency, it seems reasonable to sample the spectrum more finely in the neighborhood of the fundamental and decrease the granularity as we move up in frequency, following the expected 1/f pattern for the amplitude of the harmonics. A decrease in granularity should also be performed below the fundamental because no harmonic energy is expected below it. However, the determination of the frequency at which this decrease should begin is non-trivial, since we do not know a priori the fundamental frequency of the incoming sound (that is precisely what we wish to determine).

As we did for the selection of the warping of the amplitude of the spectrum, we appeal to the auditory system and borrow the frequency scale it seems to use: the ERB scale (see Chapter 1). Therefore, to compute the similarity between the input spectrum and the template, we sample both of them uniformly in the ERB scale, whose formula is given in Equation 1-8. This scale has several of the characteristics we desire (see Figure 1-9): it has a logarithmic behavior as f increases, tends toward a constant as f decreases, and the frequency at which the transition occurs (229 Hz) is close to the mean fundamental frequency of speech, at least for females (Bagshaw, 1994; Wang and Lin, 2004; Schwartz and Purves, 2004). It does not produce a decrease of granularity as f approaches zero, but at least does not increase without bound either, as a pure logarithmic scale does.

The convenience of the use of the ERB scale for pitch estimation over the Hertz and logarithmic scales was confirmed in informal tests, since better results were obtained when using the ERB scale. Two other common psychoacoustic scales, the Mel and Bark scales, were also explored, but they produced worse results than the ERB scale.
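Uniform sampling in the ERB scale can be sketched as follows. Equation 1-8 is not reproduced in this chapter, so the sketch assumes the commonly used ERB-rate formula ERBs(f) = 21.4 log10(1 + f/229), which is consistent with the 229 Hz transition frequency mentioned above; the function names and the 0.5 ERB step are illustrative assumptions.

```python
import math


def hz_to_erbs(f_hz):
    """Assumed ERB-rate formula: 21.4 * log10(1 + f / 229)."""
    return 21.4 * math.log10(1.0 + f_hz / 229.0)


def erbs_to_hz(erbs):
    """Inverse of hz_to_erbs."""
    return 229.0 * (10.0 ** (erbs / 21.4) - 1.0)


def uniform_erb_grid(f_max_hz, step_erbs):
    """Frequencies in Hz that are equally spaced in ERBs, from 0 up to f_max_hz."""
    count = int(hz_to_erbs(f_max_hz) / step_erbs)
    return [erbs_to_hz(i * step_erbs) for i in range(count + 1)]


grid = uniform_erb_grid(5000.0, 0.5)
spacings = [b - a for a, b in zip(grid, grid[1:])]
print(len(grid), spacings[0], spacings[-1])
```

The spacing in Hertz between consecutive samples grows with frequency, so the region near the expected fundamental is sampled more finely than the high-frequency region, as the text requires.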

PAGE 57

3.7 Window Type and Size

Throughout this chapter we have mentioned our wish to obtain a perfect match (i.e., NIP equal to 1) between the input spectrum and the template corresponding to the pitch of the input. This section deals with the feasibility of achieving such a goal. First of all, since the input is non-negative but the template has negative regions, a perfect match is impossible. One solution would be to set the negative part of the template to zero, but this would leave us without the useful property that the negative weights have: the production of low scores for noisy signals (see Section 2.3). Instead, the solution we adopt is to preserve the negative weights, but ignore them when computing the norm of the template. In other words, we normalize the kernel using only the norm of its positive part

$$K^{+}(f) = \max(0, K(f)). \tag{3-6}$$

Hereafter, we will refer to this normalization as K+-normalization.

To obtain a K+-normalized inner product (K+-NIP) close to 1, we must direct our efforts to make the shape of the spectral peaks match the shape of the positive cosine lobe used as base element of the template, and also to force the template to have a value of zero in the negative part of the cosine. Since the shape of the spectral peaks is the same for all peaks, it is enough to concentrate our efforts on one of them, and for simplicity we will do it for the peak at zero frequency.

The shape of the spectral peaks is determined by the type of window used to examine the signal. The most straightforward window is the rectangular window, which literally acts like a window: it allows seeing the signal inside the window but not outside it. More formally, the rectangular window multiplies the signal by a rectangular function of the form

PAGE 58

$$\Pi_T(t) = \begin{cases} 1/T, & \text{if } |t| < T/2, \\ 0, & \text{otherwise,} \end{cases} \tag{3-7}$$

where T is the window size. If a rectangular window is used to extract a segment of a sinusoid of frequency f Hz to compute its Fourier transform, the support of this transform will not be concentrated at a single point but will be smeared in the neighborhood of f. This effect is shown in Figure 3-7 for f = 0; in other words, the figure shows the Fourier transform of Π_T(t). This transform can be written as sinc(Tf), where the sinc function is defined as

$$\operatorname{sinc}(\varphi) = \frac{\sin(\pi\varphi)}{\pi\varphi}. \tag{3-8}$$

This function consists of a main lobe centered at zero and small side lobes that extend towards both sides of zero. For any other value of f, its Fourier transform is just a shifted version of this function, centered at f.

Figure 3-7. Fourier transform of rectangular window.

PAGE 59

Figure 3-8. Cosine lobe and square-root of the spectrum of rectangular window.

Since the height of the side lobes is small compared to the height of the main lobe, the most obvious approach to try to maximize the match between the input and the template is to match the width of the main lobe, 2/T, to the width of the cosine lobe, f/2, and solve for the free variable T. This produces an “optimal” window size, hereafter denoted T*, equal to 4/f. Figure 3-8 shows the square-root of the spectrum of a rectangular window of size T = T* = 4/f and a cosine with period f (i.e., the template used to recognize a pitch of f Hz). The K+-NIP of the main lobe of the spectrum and the positive lobe of the cosine (i.e., from -f/4 to f/4) sampled at 128 equidistant points is 0.9925, which seems satisfactorily high. However, the K+-NIP computed over the whole period of the cosine (i.e., from -f/2 to f/2) sampled at 128 equidistant points is only 0.5236, which is not very high. This low K+-NIP is caused by the relatively large side lobes, which reach a height of almost 0.5.
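The K+-NIP computation described above can be sketched as follows. This is one plausible reading of the procedure, not necessarily the exact one used to obtain the quoted figures: frequency is expressed as the ratio u = f'/f, the samples include both endpoints of the interval, and the names `kplus_nip` and `rect_spectrum` are hypothetical.

```python
import math


def sinc(v):
    """Normalized sinc of Equation 3-8."""
    return 1.0 if v == 0.0 else math.sin(math.pi * v) / (math.pi * v)


def kplus_nip(spectrum, lo, hi, num):
    """K+-normalized inner product over u = f'/f in [lo, hi]."""
    us = [lo + (hi - lo) * i / (num - 1) for i in range(num)]
    root = [math.sqrt(abs(spectrum(u))) for u in us]    # square-root of the spectrum
    kernel = [math.cos(2 * math.pi * u) for u in us]    # cosine template with period f
    dot = sum(s * k for s, k in zip(root, kernel))
    norm_root = math.sqrt(sum(s * s for s in root))
    # Only the positive part of the kernel enters the norm (Equation 3-6).
    norm_kplus = math.sqrt(sum(max(0.0, k) ** 2 for k in kernel))
    return dot / (norm_root * norm_kplus)


def rect_spectrum(u):
    """Spectrum of a rectangular window of size T* = 4/f, so that T f' = 4u."""
    return sinc(4.0 * u)


positive_lobe = kplus_nip(rect_spectrum, -0.25, 0.25, 128)
whole_period = kplus_nip(rect_spectrum, -0.5, 0.5, 128)
print(positive_lobe, whole_period)
```

The two printed values can be compared with the figures of 0.9925 and 0.5236 quoted above; small differences may arise from the sampling convention assumed here. The drop over the whole period comes from the side lobes, which overlap the negative part of the cosine.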

PAGE 60

A window with much smaller side lobes is the Hann window. The shorter side lobes are achieved by attenuating the time-domain window down towards zero at the edges³. The formula for this window is

$$h_T(t) = \frac{1}{T}\left[ 1 + \cos\!\left(\frac{2\pi t}{T}\right) \right], \tag{3-9}$$

where T is the window size (i.e., the size of its support). This window is simply one period of a raised cosine centered at zero, as illustrated in Figure 3-9. The Fourier transform of a Hann window of size T is

$$H_T(f) = \operatorname{sinc}(Tf) + \frac{1}{2}\operatorname{sinc}(Tf - 1) + \frac{1}{2}\operatorname{sinc}(Tf + 1), \tag{3-10}$$

a sum of three sinc functions, as illustrated in Figure 3-10. The width of the main lobe of this transform is 4/T, twice as large as the main lobe of the spectrum of the rectangular window.

Figure 3-9. Hann window.

³ This time-frequency relation may not be obvious at first sight, but it can be shown using Fourier analysis.
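Equation 3-10 can be checked against a direct numerical integration of Equation 3-9. The sketch below assumes the window is zero outside |t| < T/2 and uses a simple midpoint rule; the function names, the 40 ms window, and the step count are illustrative choices.

```python
import math


def sinc(v):
    """Normalized sinc of Equation 3-8."""
    return 1.0 if v == 0.0 else math.sin(math.pi * v) / (math.pi * v)


def hann(t, T):
    """Hann window of Equation 3-9, taken as zero outside its support."""
    return (1.0 + math.cos(2 * math.pi * t / T)) / T if abs(t) < T / 2 else 0.0


def hann_spectrum_closed_form(f, T):
    """Sum of three sinc functions, Equation 3-10."""
    return sinc(T * f) + 0.5 * sinc(T * f - 1) + 0.5 * sinc(T * f + 1)


def hann_spectrum_numeric(f, T, steps=20000):
    """Fourier transform by the midpoint rule; the window is even, so only
    the cosine part of the transform is non-zero."""
    dt = T / steps
    total = 0.0
    for i in range(steps):
        t = -T / 2 + (i + 0.5) * dt
        total += hann(t, T) * math.cos(2 * math.pi * f * t) * dt
    return total


T = 0.04   # 40 ms window
for f in (0.0, 10.0, 25.0, 50.0, 80.0):
    print(f, hann_spectrum_closed_form(f, T), hann_spectrum_numeric(f, T))
```

With T = 40 ms, the first zero of the transform falls at 2/T = 50 Hz, which agrees with a main lobe of total width 4/T.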

PAGE 61

Figure 3-10. Fourier transform of the Hann window. The FT of the Hann window consists of a sum of three sinc functions.

Equating this width to the width of the cosine lobe, f/2, and solving for T, we obtain an optimal window size of T* = 8/f.

Figure 3-11. Cosine lobe and square-root of the spectrum of Hann window.

PAGE 62

Figure 3-11 shows the square-root of the spectrum of a Hann window of size T = T* = 8/f and a cosine with period f. The similarity between the main lobe and the positive lobe of the cosine is remarkable. Using Equations 3-8 and 3-10 it can be shown that they match at 5 points: 0, ±f/8, and ±f/4, with values cos(0) = 1, cos(π/4) = 1/√2, and cos(π/2) = 0, respectively. The K+-NIP of the main lobe of the spectrum and the positive part of the cosine sampled at 128 equidistant points is 0.9996, and the K+-NIP computed over the whole period of the cosine sampled at 256 equidistant points is 0.8896, much larger than the one obtained with the rectangular window.

The same approach can be used to obtain the optimal window size for other window types. For the most common window types used in signal processing, it can be shown that the width of the main lobe is 2k/T, where the parameter k depends on the window type (see Oppenheim, Schafer, and Buck, 1999) and is tabulated in Table 3-1.

Table 3-1. Common windows used in signal processing*

                                 K+-NIP
Window type        k       Positive lobe   Whole period
Bartlett           2       0.9984          0.7959
Bartlett-Hann      2       0.9995          0.8820
Blackman           3       0.9899          0.9570
Blackman-Harris    4       0.9738          0.9689
Bohman             3       0.9926          0.9474
Flat top           5       0.9896          0.9726
Gauss              3.14    0.9633          0.8744
Hamming            2       0.9993          0.9265
Hann               2       0.9996          0.8896
Nuttall            4       0.9718          0.9682
Parzen             4       0.9627          0.9257
Rectangular        1       0.9925          0.5236
Triangular         2       0.9980          0.8820

* The K+-NIP values were computed using 128 equidistant samples for the positive lobe and 256 equidistant samples for the whole period.

For these windows, the optimal window

PAGE 63

size to analyze a signal with pitch f Hz can be obtained by equating 2k/T to the width of the cosine lobe, f/2, to produce T* = 4k/f. Table 3-1 also shows the K+-NIPs between the square-root of the spectrum and the cosine computed over the positive lobe of the cosine (from -f/4 to f/4) and over the whole period of the cosine (from -f/2 to f/2). The window that produces the largest K+-NIP over the whole period is the flat-top window. However, its size is so large compared to other windows that the increase in K+-NIP is probably not worth the increase in computational cost; similar results are obtained with the Blackman-Harris window, which is 4/5 its size. If computational cost is a serious issue, a good compromise is offered by the Hamming window, which requires half the size of the Blackman-Harris window, and produces a K+-NIP of about 0.93. This K+-NIP is larger than the one produced by the Hann window, with no increased computational cost (k = 2 in both cases). However, since the difference in performance between them is not large, we prefer the analytically simpler Hann window.

3.8 SWIPE

Putting all the previous sections together, the SWIPE estimate of the pitch at time t can be formulated as

$$p(t) = \arg\max_{f} \frac{\displaystyle\int_0^{\mathrm{ERBs}(f_{\max})} \frac{1}{\eta(\varepsilon)^{1/2}}\, K(f, \eta(\varepsilon))\, |X(t, f, \eta(\varepsilon))|^{1/2}\, d\varepsilon}{\left(\displaystyle\int_0^{\mathrm{ERBs}(f_{\max})} \frac{1}{\eta(\varepsilon)}\, [K^{+}(f, \eta(\varepsilon))]^{2}\, d\varepsilon\right)^{1/2} \left(\displaystyle\int_0^{\mathrm{ERBs}(f_{\max})} |X(t, f, \eta(\varepsilon))|\, d\varepsilon\right)^{1/2}}, \tag{3-11}$$

where

$$K(f, f') = \begin{cases} \cos(2\pi f'/f), & \text{if } 3/4 < f'/f < n(f) + 1/4, \\ \frac{1}{2}\cos(2\pi f'/f), & \text{if } 1/4 < f'/f < 3/4 \ \text{or}\ n(f) + 1/4 < f'/f < n(f) + 3/4, \\ 0, & \text{otherwise,} \end{cases} \tag{3-12}$$

PAGE 64

$$
X(t, f, f') = \int_{-\infty}^{\infty} w_{4k/f}(t' - t)\, x(t')\, e^{-j 2\pi f' t'}\, dt', \tag{3-13}
$$

ε is frequency in ERBs, η(·) converts frequency from ERBs into Hertz, ERBs(·) converts frequency from Hertz into ERBs, K+(·) is the positive part of K(·) {i.e., max[0, K(·)]}, fmax is the maximum frequency to be used (typically the Nyquist frequency, although 5 kHz is enough for most applications), n(f) = ⌊fmax/f − 3/4⌋, and w_{4k/f}(t) is one of the window functions in Table 3-1, with size 4k/f. The kernel corresponding to a candidate with frequency 190 Hz is shown in Figure 3-12. Panel A shows the kernel in the Hertz scale and Panel B in the ERB scale, the scale used to compute the integral.

Figure 3-12. SWIPE kernel. A) The SWIPE kernel consists of a cosine that decays as 1/f with a truncated DC lobe and halved first and last negative lobes. B) SWIPE kernel in the ERB scale.

Although the initial approach of measuring a smooth average peak-to-valley distance has been used everywhere in this chapter, we can make a more precise description of the algorithm.

PAGE 65

It can be described as the computation of the similarity between the square-root of the spectrum of the signal and the square-root of the spectrum of a sawtooth waveform, using a pitch-dependent optimal window size. This description gave rise to the name Sawtooth-Waveform Inspired Pitch Estimator (SWIPE).

3.9 SWIPE′

So far in this chapter we have concentrated our efforts on maximizing the similarity between the input and the desired template, but we have not done anything explicitly to reduce the similarity between the input and the other templates, which is the goal of this section.

The first fact we want to mention is that most of the mistakes that pitch estimators make, including SWIPE, are not random: they consist of estimating the pitch as a multiple or submultiple of the true pitch. Therefore, a good source of error to attack is the score (pitch strength) of these candidates.

A good feature to reduce supraharmonic errors is to use negative weights between harmonics. When analyzing a pitch candidate, if there is energy between any pair of consecutive harmonics of the candidate, this suggests that the pitch, if any, is a lower candidate. This idea is implemented by the negative weights, which reduce the score of the candidate if there is any energy between its harmonics. This feature is used by algorithms like SHR, AC, CEP, and SWIPE.

The effect of negative weights on supraharmonics of the pitch is illustrated in Figure 3-13A. It shows the spectrum of a signal with fundamental at 100 Hz and all its harmonics at the same amplitude (vertical lines). (Only harmonics up to 1 kHz are shown, but the signal contains harmonics up to 5 kHz.) The components are shown as lines to facilitate visualization, but in general they will be wider, with a width that depends on the window size.
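The cancellation produced by the negative weights can be illustrated with a short sketch. The Python code below implements the kernel of Equation 3-12 and scores an idealized line spectrum with unit-amplitude harmonics of 100 Hz up to 5 kHz, as in the signal just described. It is a deliberate simplification and not the full estimator: the 1/η^{1/2} weighting, the square-root warping, the ERB-scale integration, and the normalization of Equation 3-11 are all omitted, and the function names are hypothetical.

```python
import math

def swipe_kernel(f, fp, fmax=5000.0):
    """Kernel K(f, f') of Equation 3-12 for candidate f (Hz) evaluated at fp (Hz)."""
    n = math.floor(fmax / f - 0.75)           # n(f) = floor(fmax/f - 3/4)
    ratio = fp / f
    c = math.cos(2.0 * math.pi * ratio)
    if 0.75 < ratio < n + 0.25:
        return c                               # full-weight positive and negative lobes
    if 0.25 < ratio < 0.75 or n + 0.25 < ratio < n + 0.75:
        return 0.5 * c                         # halved first and last negative lobes
    return 0.0                                 # truncated DC lobe and everything above

def line_spectrum_score(f, f0=100.0, fmax=5000.0, positive_only=False):
    """Unnormalized score of candidate f against unit-amplitude harmonics of f0.

    Assumption for illustration: the spectrum is a set of ideal lines, so the
    score is just the sum of the kernel sampled at the harmonic frequencies.
    """
    score = 0.0
    m = 1
    while m * f0 <= fmax:
        k = swipe_kernel(f, m * f0, fmax)
        score += max(0.0, k) if positive_only else k
        m += 1
    return score

if __name__ == "__main__":
    print(line_spectrum_score(200.0, positive_only=True))  # positive lobes only
    print(line_spectrum_score(200.0))                       # positive and negative lobes
    print(line_spectrum_score(100.0))                       # true fundamental
```

Under these simplifications the 200 Hz candidate collects a score of 24 when only the positive lobes are used, but essentially zero once the negative lobes are included, while the 100 Hz candidate keeps a score of 49.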

PAGE 66

Figure 3-13. Most common pitch estimation errors. A) Harmonic signal with 100 Hz fundamental frequency and all the harmonics at the same amplitude, and 200 Hz kernel with positive (continuous lines) and negative (dashed lines) cosine lobes. B) Same signal and 50 Hz kernel. C) Scores using only positive cosine lobes (exhibits peaks at sub- and supraharmonics). D) Scores using both positive and negative cosine lobes (exhibits peaks at subharmonics). E) Scores using both positive and negative cosine lobes at the first and prime harmonics (exhibits a major peak only at the fundamental).

Panel A also shows the positive cosine lobes (continuous curves) used to recognize a pitch of 200 Hz and the negative cosine lobes that reside in between (dashed curves). The positive cosine lobes at the harmonics of 200 Hz produce a positive contribution towards the score of the 200 Hz candidate, but the negative cosine lobes at the odd multiples of 100 Hz cancel out this contribution. Panel C shows the score for each pitch candidate using as kernel only the positive

PAGE 67

cosine lobes, whereas Panel D shows the scores using both the positive and the negative cosine lobes. The effect on the 200 Hz peak is definite: it has disappeared. The same effect is obtained for higher order multiples of 100 Hz (not shown in the figure).

To reduce subharmonic errors, two techniques were presented in Chapter 2: the use of a decaying weighting factor for the harmonics, and the use of a bias to penalize the selection of low frequency candidates. The former is used by SHS and SWIPE, and the latter by AC. Although these techniques have an effect in reducing the score of subharmonics, significant peaks are nevertheless present at submultiples of the pitch, as shown in Figure 3-13D.

To further reduce the height of the peaks at subharmonics of the pitch, we propose to remove from the kernel the lobes located at non-prime harmonics, except the lobe at the first harmonic. Figure 3-13B helps to show the intuition behind this idea. This figure shows the same spectrum as in Figure 3-13A and the kernel corresponding to the 50 Hz candidate. This kernel has positive lobes at each multiple of 50 Hz and therefore at each multiple of 100 Hz, producing a high score for the 50 Hz candidate, as shown in Panel D. Notice that this candidate gets all of its credit from its 2nd, 4th, 6th, etc., harmonics, i.e., 100 Hz, 200 Hz, 300 Hz, etc., frequencies that suggest a fundamental frequency (and pitch) of 100 Hz. The same situation occurs with the candidate at 33 Hz (kernel not shown), but in this case its credit comes from its 3rd, 6th, 9th, etc., harmonics.

If we use only the first and prime lobes of the kernel, the candidates at subharmonics of 100 Hz would get credit only from their harmonic at 100 Hz, but not from any other. In general, it can be shown that with this approach, no candidate below 100 Hz can get credit from more than one of the harmonics of 100 Hz.
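The claim can be checked by brute force with a small sketch. The Python code below lists which of the first and prime harmonics of a candidate coincide exactly with a harmonic of the 100 Hz signal. Candidates are restricted to rational fractions of 100 Hz so that the arithmetic is exact, lobes are treated as ideal lines, and the function names and the cap of 200 harmonics are assumptions made only for this illustration.

```python
from fractions import Fraction

def is_prime(n):
    """Trial-division primality test, adequate for small harmonic numbers."""
    if n < 2:
        return False
    i = 2
    while i * i <= n:
        if n % i == 0:
            return False
        i += 1
    return True

def matched_harmonics(candidate, f0, max_harmonic=200):
    """Harmonic numbers i in {1} U P of `candidate` that land on a harmonic of f0."""
    matches = []
    for i in range(1, max_harmonic + 1):
        if i == 1 or is_prime(i):
            if (i * candidate) % f0 == 0:      # i * candidate is a multiple of f0
                matches.append(i)
    return matches

if __name__ == "__main__":
    f0 = Fraction(100)
    print(matched_harmonics(Fraction(50), f0))       # 50 Hz candidate
    print(matched_harmonics(Fraction(100, 3), f0))   # 33.3 Hz candidate
    print(matched_harmonics(Fraction(25), f0))       # 25 Hz candidate
```

In this idealized setting the 50 Hz candidate matches only through its 2nd harmonic, the 33.3 Hz candidate only through its 3rd, and the 25 Hz candidate not at all, because its 4th harmonic is not prime.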
In other words, if there is a match between one of the prime harmonics of this candidate and a harmonic of 100 Hz, no other prime harmonic of the

PAGE 68

candidate can match another harmonic of 100 Hz, and therefore the score of all the candidates below 100 Hz has to be low compared to the score of the 100 Hz candidate. This effect is evident in Figure 3-13E, which shows the scores of the pitch candidates when using only their first and prime harmonics. Certainly, there are peaks below 100 Hz, but they are relatively small compared to the peak at 100 Hz. Contrast this with Panels C and D, where the score of 50 Hz is relatively high, and therefore the risk of selecting this candidate is high.

An extra step is needed to avoid bias in the scores. Remember from the beginning of this chapter that the central idea of SWIPE was to compute the average peak-to-valley distance at harmonic locations in the spectrum. When computing this average for a single peak, the weight of the peak was twice as large as the weight of its valleys, as expressed in Equation 3-1. Since the global average is the average of this equation over all the peaks, and since each valley is associated with two peaks, the weight of the valleys, except the first and the last ones, was the same as the weight of the peaks, as expressed in Equation 3-2. However, if we use only the first and prime harmonics, the weight of the valleys will not necessarily be −1, but will depend on whether the valleys are between first or prime harmonics. The only valleys with a weight of −1 will be the valley between the first and second harmonics, and the valley between the second and third harmonics; all the other valleys will have a weight of −1/2, before applying the decaying weighting factor, of course.

This variation of SWIPE, in which only the first and prime harmonics are used to estimate the pitch, will be called SWIPE′ (read SWIPE prime). Its kernel is defined as

$$
K(f, f') = \sum_{i \in \{1\} \cup P} K_i(f, f'), \tag{3-14}
$$

where P is the set of prime numbers, and

PAGE 69

$$
K_i(f, f') = \begin{cases} \cos(2\pi f'/f) & \text{if } |f'/f - i| < 1/4, \\ \tfrac{1}{2}\cos(2\pi f'/f) & \text{if } 1/4 < |f'/f - i| < 3/4, \\ 0 & \text{otherwise.} \end{cases} \tag{3-15}
$$

Notice that the SWIPE kernel can also be written as in Equation 3-14, by including all the harmonics in the sum. The SWIPE′ kernel corresponding to a pitch candidate of 190 Hz (5.6 ERBs) is shown in Figure 3-14. The numbers on top of the peaks show the harmonic number they correspond to.

Figure 3-14. SWIPE′ kernel. Similar to the SWIPE kernel but includes only the first and prime harmonics.

3.9.1 Pitch Strength of a Sawtooth Waveform

Since the template used by SWIPE′ has peaks only at the first and prime harmonics, a perfect match between the template and the spectrum of a sawtooth waveform is impossible (unless fmax is so small relative to the pitch that the template contains no more than three

PAGE 70

harmonics). Therefore, it would be interesting to analyze the K+-NIP between the spectrum and the template as a function of the number of harmonics.

Figure 3-15 shows the pitch strength (K+-NIP) obtained using SWIPE and SWIPE′ for different pitches and different numbers of harmonics. The pitches shown are 625, 312, 156, and 78.1 Hz. They were chosen because their optimal window sizes are powers of two for the sampling rates used: 2.5, 5, 10, 20, and 40 kHz. In each case, fmax was set to the Nyquist frequency.

Figure 3-15. Pitch strength of sawtooth waveform. A) 625 Hz. B) 312 Hz. C) 156 Hz. D) 78.1 Hz.

The pitch strength estimates produced by SWIPE are larger than the ones produced by SWIPE′, except when the number of harmonics is less than four, in which case both algorithms use all the harmonics. The pitch strength estimates produced by SWIPE in Figure 3-15 have a

PAGE 71

mean of 0.93 and a variance of 5.1 × 10⁻⁵. This mean is significantly larger than the K+-NIP reported in Table 3-1 for the Hann window. The reason for the mismatch is that the granularity used to produce the data in Table 3-1 and the data in Figure 3-15 is different. The K+-NIP values in Table 3-1 are based on a sampling of 128 points per spectral lobe, while the data in Figure 3-15 are based on a sampling of 10 points per ERB, which, depending on the pitch and the harmonic being sampled, may correspond to a range of about 0 to 40 points per spectral lobe.

On the other hand, the mean of the pitch strength estimates produced by SWIPE′ is 0.87 and the variance is 1.0 × 10⁻³. The smaller mean is expected since the template of SWIPE′ includes only the first and prime harmonics, while a sawtooth waveform has energy at each of its harmonics. The larger variance is also expected since the prime numbers become sparser as they become larger, causing a reduction in the similarity of the template and the spectrum of the sawtooth waveform as the number of harmonics increases.

It would be useful to have a lower bound for the pitch strength estimates produced by SWIPE′, but an analytical formulation for it is intractable. However, the data in Figure 3-15, which is representative of a wide range of pitches and numbers of harmonics, suggests that the pitch strength produced by SWIPE′ for a sawtooth waveform does not go below 0.8.

3.10 Reducing Computational Cost

3.10.1 Reducing the Number of Fourier Transforms

The computation of Fourier transforms is one of the most computationally expensive operations of SWIPE and SWIPE′. Therefore, to reduce computational cost it is important to reduce the number of Fourier transforms. There are two strategies to achieve this: to reduce the window overlap and to share Fourier transforms among several candidates.

PAGE 72

3.10.1.1 Reducing window overlap

The most common windows used in signal processing are the ones that are attenuated towards zero at their edges (e.g., Hann and Hamming windows). A disadvantage of this attenuation is that it is possible to overlook short events if these events are located at the edges of the windows. To avoid this situation, it is common to use overlapping windows, which increases the coverage of the signal, at the cost of an increase in computation. However, after a certain point, overlapping windows start to produce redundancy in the analysis, without adding any significant benefit. The goal of this section is to propose a scheme that achieves a good balance between signal coverage and computational cost.

As mentioned in Section 1.1.4, depending on frequency, a minimum of two to four cycles are necessary to perceive the pitch of a pure tone. Based on the similarity of the data used to arrive at this conclusion and data obtained using musical instruments, it is reasonable to assume that these results are applicable to more general waveforms, in particular, to sawtooth waveforms. To avoid the interaction between the number of cycles and pitch, for purposes of the algorithm, we set the minimum number of cycles necessary to determine pitch to four, the maximum among the minimum number of cycles required over all frequencies.

Since SWIPE and SWIPE′ are designed to produce maximum pitch strength for a sawtooth waveform⁴ and zero pitch strength for a flat spectrum⁵, a natural choice to decide whether a sound has pitch is to use as threshold half the pitch strength of a sawtooth waveform. (In Section 3.9.1 it was found that the pitch strength of a sawtooth waveform is about 0.93 for SWIPE and between 0.83 and 0.93 for SWIPE′.)
⁴ In fact, SWIPE′ produces maximum pitch strength for sawtooth waveforms with the non-prime harmonics removed (except the first one), but we believe this type of signal is unlikely to occur in nature.
⁵ The pitch strength of a flat spectrum is in fact negative because of the decaying kernel envelope.

To make these algorithms produce maximum pitch strength,

PAGE 73

a perfect match between the kernel and the spectrum of the signal is necessary, which requires that the window contain eight cycles of the sawtooth waveform when using a Hann window. If the signal contains exactly eight cycles (i.e., if it is zero outside the window) and is shifted slightly with respect to the window, the pitch strength decreases, and it reaches a limit of zero when the signal gets completely out of the window. Although hard to show analytically, it is easy to show numerically that the relation between the shift and pitch strength is linear. Therefore, if the window contains four or more cycles of the sawtooth waveform, the pitch strength is at least half the maximum attainable pitch strength (i.e., the one achieved when the window is full of the sawtooth waveform), and if the window contains fewer than four cycles of the sawtooth waveform, the pitch strength is less than half the maximum attainable pitch strength.

Figure 3-16. Windows overlapping.

Object 3-2. Four cycles of a 100 Hz sawtooth waveform (WAV file, 2 KB)
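The relation between window coverage and pitch strength described above can be written down as a small sketch. The Python code below takes the optimal window size T* = 4k/f from Section 3.7 (k = 2 for the Hann window, per Table 3-1) and the linear relation stated in the text as given; it does not verify that relation numerically. The function names are hypothetical.

```python
def optimal_window_cycles(k=2):
    """Number of pitch cycles spanned by the optimal window T* = 4k/f."""
    return 4 * k

def optimal_window_samples(f, fs, k=2):
    """Optimal window size in samples for a candidate of f Hz sampled at fs Hz."""
    return 4.0 * k * fs / f

def relative_pitch_strength(cycles_in_window, k=2):
    """Fraction of the maximum attainable pitch strength.

    Assumes the linear relation stated in the text: the fraction grows in
    proportion to the number of signal cycles inside the window and
    saturates when the window is full.
    """
    full = optimal_window_cycles(k)
    return max(0.0, min(1.0, cycles_in_window / full))

if __name__ == "__main__":
    print(optimal_window_cycles())                   # cycles in a Hann window
    print(optimal_window_samples(100.0, 10000.0))    # samples at 100 Hz, 10 kHz
    print(relative_pitch_strength(4))                # four cycles inside the window
```

With a Hann window the optimal size spans eight cycles (800 samples for a 100 Hz candidate at 10 kHz), and four cycles inside the window give exactly half of the maximum pitch strength, which is the threshold used in this section.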

PAGE 74

Therefore, if we determine the existence of pitch based on a pitch strength threshold equal to half the maximum attainable pitch strength, to determine as pitched a signal consisting of four cycles of a sawtooth waveform, we need to ensure that there exists at least one window whose coverage includes the whole signal. It is straightforward to show that to achieve this goal, we need to distribute the windows such that their separation is no larger than four cycles of the pitch period of the signal. In other words, the windows must overlap by at least 50%.

This situation is illustrated in Figure 3-16, which shows a signal consisting of four cycles of a sawtooth waveform (listen to Object 3-2) and two Hann windows centered at the beginning and the end of the signal. The windows are separated by a distance of four cycles, and the support of each of them overlaps with the whole signal, making it possible for each window to reach the pitch strength threshold. If the signal is slightly shifted in any direction, one of the windows will cover fewer than four periods, but the other will cover the four periods.

This would not be true if the separation of the windows were larger than four cycles. If the support of one of the windows overlaps completely with the signal but the separation of the windows is larger than four cycles, the other window will not cover the signal completely, and therefore a small shift of the signal towards the latter window would not necessarily put the whole signal inside the window, making it impossible for any of the windows to produce a pitch strength larger than the threshold.

3.10.1.2 Using only power-of-two window sizes

There is a problem with the optimal window size (O-WS) proposed in Section 3.7: each pitch candidate has its own, which means that a different STFT must be computed for each candidate.
If we separate the candidates at a distance of 1/8 semitone over a range of 5 octaves (appropriate for music, for example), we will need to compute 8*12*5 = 480 STFTs for each

PAGE 75

pitch estimate. Not only that, for some WSs it may be inefficient to use an FFT (recall that the FFT is more efficient for window sizes that are powers of two).

To alleviate this problem, we propose to substitute the O-WS with the power-of-two (P2) WS that produces the maximum K+-NIP between the square-root of the main lobe of the spectrum and the cosine kernel. To find such a WS, it is convenient to have a closed-form formula for the K+-NIP of these functions, but this involves integrating the product of a cosine and the square-root of the sum of three sinc functions, which is analytically intractable. As an alternative, we approximate the square-root of the spectral lobe with an idealized spectral lobe (ISL) consisting of the function it approximates: a positive cosine lobe.

Figure 3-17 shows a K+-normalized cosine whose positive part has a width of f/2 (i.e., the cosine template used by an f Hz pitch candidate), and two normalized ISLs whose widths are half and twice the width of the positive part of the cosine.

Figure 3-17. Idealized spectral lobes.

Since the cosine and the ISLs are symmetric around zero, the K+-NIP can be computed using only the positive frequencies. Hence, the K+-NIP

PAGE 76

of the central positive lobe of a cosine with period f/r (the ISL) and a cosine with period f (the template) can be computed as

$$
\begin{aligned}
\Pi(r) &= \frac{\displaystyle\int_{0}^{f/4r} \cos(2\pi r f'/f)\, \cos(2\pi f'/f)\, df'}{\left(\displaystyle\int_{0}^{f/4r} \cos^{2}(2\pi r f'/f)\, df'\right)^{1/2} \left(\displaystyle\int_{0}^{f/4} \cos^{2}(2\pi f'/f)\, df'\right)^{1/2}} \\
&= \frac{\dfrac{1}{2}\displaystyle\int_{0}^{f/4r} \left[\cos\bigl(2\pi (r+1) f'/f\bigr) + \cos\bigl(2\pi (r-1) f'/f\bigr)\right] df'}{(f/8r)^{1/2}\,(f/8)^{1/2}} \\
&= \frac{2 r^{1/2}}{\pi} \left[\frac{\sin\bigl(2\pi (r+1) f'/f\bigr)}{r+1} + \frac{\sin\bigl(2\pi (r-1) f'/f\bigr)}{r-1}\right]_{0}^{f/4r} \\
&= \frac{2 r^{1/2}}{\pi} \left[\frac{\sin\bigl((r+1)\pi/2r\bigr)}{r+1} + \frac{\sin\bigl((r-1)\pi/2r\bigr)}{r-1}\right].
\end{aligned} \tag{3-16}
$$

It is convenient to transform the input of this function to a base-2 logarithmic scale, λ = log2(r), and then redefine the function as

$$
\Pi(\lambda) = \frac{2^{1+\lambda/2}}{\pi} \left[\frac{\sin\bigl((2^{\lambda}+1)\pi/2^{\lambda+1}\bigr)}{2^{\lambda}+1} + \frac{\sin\bigl((2^{\lambda}-1)\pi/2^{\lambda+1}\bigr)}{2^{\lambda}-1}\right]. \tag{3-17}
$$

Figure 3-18A shows Π(λ) for λ between −1 and 1 (i.e., r = 2^λ between 1/2 and 2). As λ departs from zero, Π(λ) departs from 1, as expected. However, the distribution is not symmetric: a decrease in λ has a larger effect on Π(λ) than an increase in λ. This makes sense since a decrease in λ corresponds to a widening of the ISL, which puts part of it in the region where the cosine template is negative (see wider ISL in Figure 3-17), producing a large decrease in Π(λ). On the other hand, narrowing the ISL keeps it in the positive region of the cosine template, producing a smaller decrease in Π(λ).
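The asymmetry just described can be reproduced numerically with a minimal sketch. The Python function below evaluates the closed form of Equation 3-17, assuming the ISL is the cosine lobe cos(2πrf′/f) with r = 2^λ, so that it narrows as λ grows. The removable singularity at λ = 0 is handled by substituting the limit of the second term, and the function name is hypothetical.

```python
import math

def isl_template_nip(lam):
    """K+-NIP between an idealized spectral lobe and the cosine template.

    Evaluates (2*sqrt(r)/pi) * [sin((r+1)*pi/(2r))/(r+1) + sin((r-1)*pi/(2r))/(r-1)]
    with r = 2**lam. Intended for -1 <= lam <= 1.
    """
    r = 2.0 ** lam
    a = math.pi / (2.0 * r)
    first = math.sin(a * (r + 1.0)) / (r + 1.0)
    d = r - 1.0
    # sin(a*d)/d tends to a as d -> 0, which removes the singularity at lam = 0.
    second = a if abs(d) < 1e-12 else math.sin(a * d) / d
    return 2.0 * math.sqrt(r) / math.pi * (first + second)

if __name__ == "__main__":
    for lam in (-1.0, -0.5, 0.0, 0.5, 1.0):
        print(lam, round(isl_template_nip(lam), 4))
```

Under these assumptions the function returns 1 at λ = 0, about 0.85 at λ = 1, and about 0.60 at λ = −1, so halving the window size (a wider ISL) costs considerably more than doubling it.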

PAGE 77

Figure 3-18. K+-normalized inner product between template and idealized spectral lobes.

Figure 3-18A can be helpful in finding the P2-WS that produces the largest K+-NIP between the ISL and the template. If the O-WS for the template is T* seconds and the sampling rate is fs, then the O-WS in samples is N* = T* fs, which corresponds to λ = 0 in the figure. Smaller λ's correspond to smaller WSs, and larger λ's correspond to larger WSs. In general, the WS in number of samples, denoted N, and λ are related through the equation N = 2^λ N*.

It is straightforward to show that the two λ's that correspond to the two closest P2-WSs to the optimal must be between −1 and 1, and not only that, their difference must be 1. Figure 3-18B shows the difference between Π(λ) and Π(λ − 1) as a function of λ, for λ between 0 and 1. From the figure we can infer that, for λ's between 0 and 0.56, we should use the larger P2-WS, and for λ between 0.56 and 1, we should use the smaller P2-WS. However, Figure 3-18B shows also that there is not much loss in the K+-NIP by choosing 0.5 as threshold rather than 0.56. Therefore, to simplify the algorithm, we decided to set the threshold at 0.5. In other words, to determine the

PAGE 78

P2-WS to use for a pitch candidate, we transform the O-WS and the P2-WSs to a logarithmic scale, and choose the P2-WS closest to the optimal.

Figure 3-19. Individual and combined pitch strength curves.

Unfortunately, this approach produces discontinuities in the pitch strength (PS) curves, as illustrated in Figure 3-19A. The PS values marked with a plus sign were produced using a WS larger than the one used for the values marked with a circle. To emphasize the effect, the pitch of the signal (220 Hz) was chosen to match the point at which the change of WS occurs. Since the PS values produced by the larger window in the neighborhood of the pitch are larger than the ones produced by the smaller window, the pitch could be biased toward a lower value.

Although an effort was made to find an appropriate value for the threshold, it was based on an idealized spectrum, which does not have the side lobes found in real spectra. This problem can be alleviated by using a threshold larger than 0.56, determined through trial and error, but we

PAGE 79

found a better solution: to compute the PS as a linear combination of the PS values produced by the two closest P2-WSs, where the weight of each P2-WS decreases linearly with its log-distance to the O-WS.

Concretely, to determine the P2-WSs used to compute the PS of a candidate with frequency f Hz, the O-WS is written as a power of two, N* = 2^{L+λ}, where L is an integer and 0 ≤ λ < 1. Then, the PS values S0(f) and S1(f) are computed using windows of size 2^L and 2^{L+1}, respectively. Finally, these PSs are combined into a single one to produce the final PS

$$
S(f) = (1 - \lambda)\, S_0(f) + \lambda\, S_1(f). \tag{3-18}
$$

Figure 3-19B shows how this combination of PS curves smooths the discontinuity found in Figure 3-19A.

Figure 3-20. Pitch strength loss when using suboptimal window sizes.

It would be interesting to know how much is lost in PS by using the formula proposed in Equation 3-18, when the O-WS is not a power of two. This loss can be approximated by finding

PAGE 80

the minimum of the linear combination (1 − λ)Π(λ) + λΠ(λ − 1) for 0 < λ < 1, which is plotted in Figure 3-20. It can be seen that it has a minimum of 0.93 at around λ = 0.4. Therefore, the maximum loss when computing PS using the two closest P2-WSs is 7%. Since the minimum PS of a sawtooth waveform when using an O-WS is about 0.92 for SWIPE and 0.83 for SWIPE′ (see Figure 3-15), the minimum pitch strength of a sawtooth waveform when using the two closest P2-WSs is about 0.86 for SWIPE and 0.77 for SWIPE′.

Besides using window sizes that are optimal for the FFT computation, the approximation of O-WSs using P2-WSs has another advantage that is probably more important: the same FFT can be shared by several pitch candidates, more precisely, by all the candidates within an octave of the optimal pitch for that FFT. Going back to the example that started this section, the replacement of the O-WS with the closest P2-WSs reduces the number of FFTs required to estimate the pitch from 480 to just 5: a huge saving in computation.

Using this approach, and translating the algorithm to a discrete-time domain (necessary to compute an FFT), we can write the SWIPE estimate of the pitch at the discrete-time index τ as

$$
p[\tau] = \arg\max_{f}\; \bigl(1 - \lambda(f)\bigr)\, S_{L(f)}(\tau, f) + \lambda(f)\, S_{L(f)+1}(\tau, f), \tag{3-19}
$$

where

$$
\lambda(f) = L^{*}(f) - L(f), \tag{3-20}
$$

$$
L(f) = \lfloor L^{*}(f) \rfloor, \tag{3-21}
$$

$$
L^{*}(f) = \log_2(4 k f_s / f), \tag{3-22}
$$

PAGE 81

$$
S_L(\tau, f) = \frac{\displaystyle\sum_{m=0}^{\lfloor \mathrm{ERBs}(f_{\max})/\Delta\varepsilon \rfloor} \frac{1}{\eta(m\Delta\varepsilon)^{1/2}}\, K(f, \eta(m\Delta\varepsilon))\, \bigl|\hat{X}_{2^L}[\tau, \eta(m\Delta\varepsilon)]\bigr|^{1/2}}{\left(\displaystyle\sum_{m=0}^{\lfloor \mathrm{ERBs}(f_{\max})/\Delta\varepsilon \rfloor} \frac{1}{\eta(m\Delta\varepsilon)}\, [K^{+}(f, \eta(m\Delta\varepsilon))]^{2}\right)^{1/2} \left(\displaystyle\sum_{m=0}^{\lfloor \mathrm{ERBs}(f_{\max})/\Delta\varepsilon \rfloor} \bigl|\hat{X}_{2^L}[\tau, \eta(m\Delta\varepsilon)]\bigr|\right)^{1/2}}, \tag{3-23}
$$

$$
\hat{X}_N[\tau, f'] = I\bigl(\{0, \ldots, N-1\}\, f_s/N,\; X_N[\tau, \{0, \ldots, N-1\}],\; f'\bigr), \tag{3-24}
$$

$$
X_N[\tau, \varphi] = \sum_{\tau' = -\infty}^{\infty} w_N[\tau' - \tau]\, x[\tau']\, e^{-j 2\pi \varphi \tau'/N}, \tag{3-25}
$$

Δε is the ERB scale step size (0.1 gives good enough resolution), I(Φ, Ψ, φ) is an interpolating function that uses the functional relations Ψk = F(Φk) to predict the value of F(φ), and X_N[τ, φ] (φ = 0, 1, …, N − 1) is the discrete Fourier transform (computed via FFT) of the discrete signal x[τ′], multiplied by the size-N windowing function w_N[τ′], centered at τ. The other variables, constants, and functions are defined as before (see Section 3.8). A Matlab implementation of this algorithm is given in Appendix A.

3.10.2 Reducing the Number of Spectral Integral Transforms

The pitch resolution of SWIPE and SWIPE′ depends on the granularity of the pitch candidates. Therefore, to achieve high pitch resolution, a large number of pitch candidates must be used, and since the pitch strength of each candidate is determined by computing a K+-NIP between its kernel and the spectrum, the computational cost of the algorithm would increase enormously. To avoid this situation, we propose to compute K+-NIPs only for certain candidates, and then use interpolation to estimate the pitch strength of the other candidates.

As noted by de Cheveigné (2002), the AC of a signal is the Fourier transform of its power spectrum, and therefore the AC is a sum of cosines that can be approximated around zero by using a Taylor series expansion with even powers. If the signal is periodic, its AC is also

PAGE 82

periodic, and therefore the shape of the AC around the pitch period is the same as the shape around zero, and therefore it can also be approximated by the same Taylor series, centered at the pitch period. If the width of the spectral lobes is narrow and the energy of the high frequency components is small, the terms of order 4 in the series vanish as the independent variable approaches the pitch period, and therefore the series can be approximated using a parabola.

Since SWIPE performs an inner product between the spectrum and a kernel consisting of cosine lobes, a similar argument can be applied to the pitch strength curves produced by SWIPE. However, the quality of the fit of a parabola is not guaranteed for two reasons: first, the spectral lobes produced by SWIPE are not narrow, in fact, they are as wide as the positive lobes of the cosine; and second, the use of the square-root of the spectrum rather than its energy makes the contribution of the high frequency components large, violating the requirement of low contribution of high frequency components. Nevertheless, parabolic interpolation produces a good fit to the pitch strength curve in the neighborhood of the SWIPE peaks, as we will proceed to show.

Let's derive an approximation to the pitch strength curve σ(t) produced by SWIPE for a sawtooth waveform with fundamental frequency f0 = 1/T0 Hz in the neighborhood of the pitch period T0. To simplify the equations, let's define the scaling transformations ω = f/f0 and τ = 2πt/T0. To make the calculations tractable, let's use idealized spectral lobes (i.e., cosine lobes) and let's ignore the normalization factors and the change of width of the spectral lobe with change of window size caused by a change of pitch candidate. Let's also replace the continuous decaying envelope of the kernel with a decaying step function that gives a weight of 1/k to the k-th harmonic.
With all these simplifications, the pitch strength of a candidate with scaled pitch

PAGE 83

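period τ in the neighborhood of 2π (i.e., when the non-scaled pitch period t is in the neighborhood of T0) can be approximated as

$$
\sigma(\tau) = \sum_{k=1}^{n} \sigma_k(\tau), \tag{3-26}
$$

where

$$
\begin{aligned}
\sigma_k(\tau) &= \frac{1}{k} \int_{k-1/4}^{k+1/4} \cos(2\pi\omega)\, \cos(\tau\omega)\, d\omega \\
&= \frac{1}{2k} \int_{k-1/4}^{k+1/4} \bigl[\cos\bigl((2\pi+\tau)\omega\bigr) + \cos\bigl((2\pi-\tau)\omega\bigr)\bigr]\, d\omega \\
&= \frac{1}{2k} \left[\frac{\sin\bigl((2\pi+\tau)\omega\bigr)}{2\pi+\tau} + \frac{\sin\bigl((2\pi-\tau)\omega\bigr)}{2\pi-\tau}\right]_{k-1/4}^{k+1/4} \\
&= \frac{1}{2k} \left[\frac{\sin\bigl((k+1/4)(2\pi+\tau)\bigr) - \sin\bigl((k-1/4)(2\pi+\tau)\bigr)}{2\pi+\tau}\right. \\
&\qquad\quad \left. + \frac{\sin\bigl((k+1/4)(2\pi-\tau)\bigr) - \sin\bigl((k-1/4)(2\pi-\tau)\bigr)}{2\pi-\tau}\right].
\end{aligned} \tag{3-27}
$$

Since we are interested in approximating this function in the neighborhood of 2π, we can equivalently shift the function 2π units to the left by defining σ̃k(τ̃) = σk(τ̃ + 2π), and then approximate σ̃k(τ̃) in the neighborhood of zero. Since sin(x)/x = 1 − x²/3! + x⁴/5! + O(x⁶) in the neighborhood of zero, it is useful to express σ̃k(τ̃) as

$$
\begin{aligned}
\tilde{\sigma}_k(\tilde{\tau}) &= \frac{k+1/4}{2k}\, \frac{\sin\bigl((k+1/4)\tilde{\tau}\bigr)}{(k+1/4)\tilde{\tau}} - \frac{k-1/4}{2k}\, \frac{\sin\bigl((k-1/4)\tilde{\tau}\bigr)}{(k-1/4)\tilde{\tau}} \\
&\quad - \frac{1}{2k}\, \frac{\sin\bigl((k+1/4)\tilde{\tau}\bigr) - \sin\bigl((k-1/4)\tilde{\tau}\bigr)}{4\pi + \tilde{\tau}},
\end{aligned} \tag{3-28}
$$

which has the Taylor series expansion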

PAGE 84

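$$
\begin{aligned}
\tilde{\sigma}_k(\tilde{\tau}) &= \frac{k+1/4}{2k} \left[1 - \frac{(k+1/4)^{2}\tilde{\tau}^{2}}{3!} + \frac{(k+1/4)^{4}\tilde{\tau}^{4}}{5!} + O(\tilde{\tau}^{6})\right] \\
&\quad - \frac{k-1/4}{2k} \left[1 - \frac{(k-1/4)^{2}\tilde{\tau}^{2}}{3!} + \frac{(k-1/4)^{4}\tilde{\tau}^{4}}{5!} + O(\tilde{\tau}^{6})\right] \\
&\quad - \frac{1}{2k(4\pi + \tilde{\tau})} \left[(k+1/4)\tilde{\tau} - \frac{(k+1/4)^{3}\tilde{\tau}^{3}}{3!} + O(\tilde{\tau}^{5})\right] \\
&\quad + \frac{1}{2k(4\pi + \tilde{\tau})} \left[(k-1/4)\tilde{\tau} - \frac{(k-1/4)^{3}\tilde{\tau}^{3}}{3!} + O(\tilde{\tau}^{5})\right]
\end{aligned} \tag{3-29}
$$

in the neighborhood of zero. Finally, the approximation of the pitch strength curve in the shifted-time domain is

$$
\tilde{\sigma}(\tilde{\tau}) = \sum_{k=1}^{n} \tilde{\sigma}_k(\tilde{\tau}) = a_0 + a_1\tilde{\tau} + a_2\tilde{\tau}^{2} + a_3\tilde{\tau}^{3} + a_4\tilde{\tau}^{4} + O(\tilde{\tau}^{5}). \tag{3-30}
$$

Figure 3-21. Coefficients of the pitch strength interpolation polynomial.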

PAGE 85

Figure 3-21 shows the relative value of the coefficients of the expansion as a function of the number of harmonics in the signal. As the number of harmonics increases, the relative weight of the order-4 coefficient increases. However, as τ̃ approaches zero, its fourth power becomes so small that its overall contribution to the sum is small compared to the contribution of the order-2 term.

This effect is clear in Figure 3-22, which shows σ̃(τ̃) for a sawtooth waveform with 15 harmonics using polynomials of order 2 and order 4 in the range ±0.045, which corresponds to ±1/8 semitones. The curve has been scaled to have a maximum of 1. The large circles correspond to candidates separated by 1/8 semitones, which is the interval used in our implementation of SWIPE and SWIPE′ for the distance between pitch candidates for which the pitch strength is computed directly. The other markers correspond to candidates separated by 1/64 semitones, which is the resolution used to fine tune the pitch strength curve based on the pitch strength of the candidates for which the pitch strength is computed directly.

Figure 3-22. Interpolated pitch strength.

As observed in

PAGE 86

the figure, for such small values of τ̃, the pitch strength values obtained with an order-2 polynomial (squares) are indistinguishable from the ones obtained with an order-4 polynomial (diamonds). Hence, a parabola is good enough to estimate the pitch strength between candidates separated at distances as small as 1/8 semitones.

3.11 Summary

This chapter described the SWIPE algorithm and its variation SWIPE′. The initial approach of the algorithm was the search for the frequency that maximizes the average peak-to-valley distance at harmonic locations. Several modifications to this idea were applied to improve its performance: the locations of the harmonics were blurred, the spectral amplitude and the frequency scale were warped, an appropriate window type and size were chosen, and simplifications to reduce computational cost were introduced. After these modifications, SWIPE estimates the pitch as the fundamental frequency of the sawtooth waveform whose spectrum best matches the spectrum of the input signal. Its variation, SWIPE′, uses only the first and prime harmonics of the signal.
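The parabolic interpolation discussed in Section 3.10.2 can be sketched with a generic three-point fit. The Python code below returns the vertex of the parabola through three equally spaced pitch strength samples. It is only an illustration of the idea, not the Appendix A implementation: the function name is hypothetical, and the choice of axis on which the candidates are equally spaced is left to the caller.

```python
def parabolic_peak(x_center, step, y_left, y_center, y_right):
    """Vertex (x, y) of the parabola through three equally spaced samples.

    The samples are taken at x_center - step, x_center, and x_center + step.
    If the three points do not form a strict local maximum, the central
    sample is returned unchanged.
    """
    curvature = y_left - 2.0 * y_center + y_right
    if curvature >= 0.0:
        return x_center, y_center
    offset = 0.5 * (y_left - y_right) / curvature     # in units of `step`
    x_peak = x_center + offset * step
    y_peak = y_center - 0.25 * (y_left - y_right) * offset
    return x_peak, y_peak

if __name__ == "__main__":
    # Samples of y = 1 - (x - 0.3)**2 at x = -1, 0, 1.
    ys = [1.0 - (x - 0.3) ** 2 for x in (-1.0, 0.0, 1.0)]
    print(parabolic_peak(0.0, 1.0, *ys))
```

For samples drawn from an exact parabola, as in the usage example, the fit recovers the true vertex; for a real pitch strength curve the result is only as good as the order-2 approximation argued for in Section 3.10.2.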

PAGE 87

CHAPTER 4
EVALUATION

To assess the relevance of SWIPE and SWIPE′, they were compared against other algorithms using two speech databases and a musical instruments database. This chapter presents a brief description of these algorithms, databases, and the evaluation process. A more detailed description is given in Appendix B.

4.1 Algorithms

The algorithms with which SWIPE and SWIPE′ were compared were the following:

AC-P: This algorithm (Boersma, 1993) computes the autocorrelation of the signal and divides it by the autocorrelation of the window used to analyze the signal. It uses post-processing to reduce discontinuities in the pitch trace. It is available with the Praat System (http://www.fon.hum.uva.nl/praat). The name of the function is ac.

AC-S: This algorithm uses the autocorrelation of the cubed signal. It is available with the Speech Filing System (http://www.phon.ucl.ac.uk/resource/sfs). The name of the function is fxac.

ANAL: This algorithm (Secrest and Doddington, 1983) uses autocorrelation to estimate the pitch, and dynamic programming to remove discontinuities in the pitch trace. It is available with the Speech Filing System (http://www.phon.ucl.ac.uk/resource/sfs). The name of the function is fxanal.

CATE: This algorithm uses a quasi-autocorrelation function of the speech excitation signal to estimate the pitch. We implemented it based on its original description (Di Martino, 1999). The dynamic programming component used to remove discontinuities in the pitch trace was not implemented.

CC: This algorithm uses cross-correlation to estimate the pitch and post-processing to remove discontinuities in the pitch trace.
It is available with the Praat System at http://www.fon.hum.uva.nl/praat The name of the function is cc CEP: This algorithm (Noll, 1967) uses the cepstru m of the signal and is available with the Speech Filing System at http://www.phon.ucl.ac.uk/resource/sfs The name of the function is fxcep ESRPD: This algorithm (Bagshaw, 1993; Medan, 1991) uses a normalized cross-correlation to estimate the pitch, a nd post-processing to remove discontinuities in the pitch trace. It is available with the Festival Speech Filing System at http://www.cstr.ed.ac.uk/projects/festival The name of the function is pda

RAPT: This algorithm (Secrest and Doddington, 1983) uses a normalized cross-correlation to estimate the pitch, and dynamic programming to remove discontinuities in the pitch trace. It is available with the Speech Filing System at http://www.phon.ucl.ac.uk/resource/sfs. The name of the function is fxrapt.

SHS: This algorithm (Hermes, 1988) uses subharmonic summation. It is available with the Praat System at http://www.fon.hum.uva.nl/praat. The name of the function is shs.

SHR: This algorithm (Sun, 2000) uses the subharmonic-to-harmonic ratio. It is available at Matlab Central (http://www.mathworks.com/matlabcentral) under the title "Pitch Determination Algorithm". The name of the function is shrp.

TEMPO: This algorithm (Kawahara et al., 1999) uses the instantaneous frequency of the outputs of a filterbank. It is available with the STRAIGHT System at its author's web page, http://www.wakayama-u.ac.jp/~kawahara. The name of the function is exstraightsource.

YIN: This algorithm (de Cheveigné and Kawahara, 2002) uses a modified version of the average squared difference function. It is available from its author's web page at http://www.ircam.fr/pcm/cheveign/sw/yin.zip. The name of the function is yin.

4.2 Databases

The databases used to test the algorithms were the following:

DVD: Disordered Voice Database. This database contains 657 samples of sustained vowels produced by persons with disordered voices. It can be bought from Kay Pentax (http://www.kayelemetrics.com).

KPD: Keele Pitch Database. This speech database was collected by Plante et al. (1995) at Keele University with the purpose of evaluating pitch estimation algorithms. It contains about 8 minutes of speech spoken by five males and five females. Laryngograph data was recorded simultaneously with speech, and was used to produce estimates of the fundamental frequency. It is publicly available at ftp://ftp.cs.keele.ac.uk/pub/pitch.

MIS: Musical Instruments Samples. This database contains more than 150 minutes of sound produced by 20 different musical instruments. It was collected at the University of Iowa Electronic Music Studios, directed by Lawrence Fritts, and is publicly available at http://theremin.music.uiowa.edu.

PBD: Paul Bagshaw's Database for evaluating pitch determination algorithms. This database contains about 8 minutes of speech spoken by one male and one female. Laryngograph data was recorded simultaneously with speech, and was used to produce estimates of the fundamental frequency. It was collected by Paul Bagshaw at the University of Edinburgh (Bagshaw et al., 1993; Bagshaw, 1994), and is publicly available at http://www.cstr.ed.ac.uk/research/projects/fda.

4.3 Methodology

The algorithms were asked to produce a pitch estimate every millisecond. The search range was set to 40-800 Hz for speech and 30-1666 Hz for musical instruments. The algorithms were given the freedom to decide whether the sound was pitched or not. However, to compute our statistics, we considered only the time instants at which all the algorithms agreed that the sound was pitched.

Special care was taken to account for time misalignments. Specifically, the pitch estimates were associated with the time corresponding to the center of their respective analysis windows, and when the ground truth pitch varied over time (i.e., for PBD and KPD), the estimated pitch time series were shifted within a range of 100 ms to find the best alignment with the ground truth.

The performance measure used to compare the algorithms was the gross error rate (GER). A gross error occurs when the estimated pitch is off from the reference pitch by more than 20%. At first glance this margin of error seems too large, but considering that most of the errors that pitch estimation algorithms produce are octave errors (i.e., halving or doubling the pitch), this is a reasonable metric. In addition, this tolerance gives room for dealing with misalignments. The GER measure has been used previously to test PEAs by other researchers (Bagshaw, 1993; Di Martino, 1999; de Cheveigné and Kawahara, 2002).

4.4 Results

Table 4-1 shows the GERs for each of the algorithms over each of the speech databases. Both the rows and the columns are sorted by average GER: the best algorithms are at the top, and the more difficult databases are at the right. The best algorithm overall is SWIPE', followed by SHS and SWIPE. Although on average SHS performs better than SWIPE, the only database in which SHS beats SWIPE is the disordered voice database, which indicates that SWIPE performs better than SHS on normal speech.
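
The gross error rate defined in Section 4.3 can be illustrated with a minimal Python sketch. The function name and the four example frames are hypothetical; the sketch assumes the estimates and references have already been aligned in time and restricted to the frames where every algorithm reports a pitch.

```python
def gross_error_rate(estimates_hz, references_hz, tolerance=0.20):
    """Percentage of frames whose estimate is off by more than `tolerance`.

    Both sequences are assumed to be time-aligned and to contain only
    pitched frames (an assumption of this sketch, not a library API).
    """
    if len(estimates_hz) != len(references_hz):
        raise ValueError("estimates and references must have the same length")
    if not references_hz:
        raise ValueError("at least one pitched frame is required")
    gross = sum(
        1
        for est, ref in zip(estimates_hz, references_hz)
        if abs(est - ref) > tolerance * ref
    )
    return 100.0 * gross / len(references_hz)


# Hypothetical frames: the second estimate is an octave too low.
reference_hz = [100.0, 200.0, 150.0, 220.0]
estimated_hz = [101.0, 100.0, 149.0, 250.0]
print(gross_error_rate(estimated_hz, reference_hz))  # 25.0
```

Only the octave error counts here: the last frame deviates by 30 Hz, which is within 20% of 220 Hz.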

Table 4-1. Gross error rates for speech*

                    Gross error (%)
Algorithm     PBD      KPD      DVD      Average
SWIPE'        0.13     0.83     0.63      0.53
SHS           0.15     1.00     1.10      0.75
SWIPE         0.15     0.87     1.70      0.91
RAPT          0.75     1.00     2.40      1.40
TEMPO         0.32     1.90     2.00      1.40
YIN           0.33     1.40     4.50      2.10
SHR           0.69     1.50     5.10      3.50
ESRPD         1.40     3.90     4.60      5.00
CEP           6.10     4.20    14.00      5.90
AC-P          0.73     2.90    16.00      6.70
CATE          2.60    10.00     7.20      6.60
CC            0.48     3.60     5.00      2.40
ANAL          0.83     2.00    35.00     13.00
AC-S          8.80     7.00    40.00     19.00
Average       1.70     3.00     9.90      4.90
* Values computed using two significant digits.

Table 4-2. Proportion of overestimation errors relative to total gross errors*

                Proportion of overestimations
Algorithm     DVD      PBD      KPD      Average
CC            0.0      0.0      0.1       0.0
SHS           0.0      0.0      0.3       0.1
RAPT          0.0      0.1      0.5       0.2
SHR           0.0      0.4      0.3       0.2
AC            0.0      0.4      0.2       0.2
AC            0.0      0.2      0.3       0.2
ANAL          0.0      0.5      0.4       0.3
CEP           0.4      0.5      0.4       0.4
SWIPE         0.0      0.6      0.7       0.4
SWIPE         0.1      0.6      0.7       0.4
YIN           0.1      0.9      0.5       0.5
TEMPO         0.1      0.8      0.9       0.6
CATE          0.5      0.5      0.8       0.6
ESRPD         0.5      0.7      0.9       0.7
Average       0.1      0.4      0.5       0.3
* Values computed using one significant digit.

Table 4-2 shows the proportion of GEs caused by overestimations of the pitch with respect to the total number of GEs. The proportion of GEs caused by underestimation of the pitch is just one minus the values shown in the table. Algorithms at the top have a tendency to underestimate the pitch, while algorithms at the bottom have a tendency to overestimate it. Most algorithms tend to underestimate the pitch in the disordered voice database, while the errors are more balanced in the normal speech databases.

Table 4-3. Gross error rates by gender*

               Gross error (%)
Algorithm     Male     Female    Average
SWIPE'        0.36      2.40      1.4
SHS           0.55      2.50      1.5
SWIPE         0.49      2.70      1.6
RAPT          0.42      2.90      1.7
TEMPO         0.67      3.10      1.9
SHR           0.61      3.60      2.1
YIN           1.10      3.20      2.2
AC-P          2.10      3.60      2.9
CEP           1.80      4.20      3.0
CC            2.40      4.50      3.5
ESRPD         3.10      3.90      3.5
ANAL          1.30      5.90      3.6
AC-S          3.20     10.00      6.6
CATE         11.00      4.20      7.6
Average       2.10      4.00      3.1
* Values computed using two significant digits.

Table 4-3 shows the pitch estimation performance as a function of gender for the two databases for which we had access to this information: PBD and KPD. The error rates are on average larger for female speech than for male speech.

Table 4-4 shows the GERs for the musical instruments database. Some of the algorithms were not evaluated on this database because they did not provide a mechanism to set the search range, and the range they covered was smaller than the pitch range spanned by the database. The two algorithms that performed best were SWIPE' and SWIPE.
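
The proportion of overestimation errors reported in Table 4-2 can be sketched as follows. This is only an illustration: the function name and the example frames are hypothetical, and the 20% threshold is the gross error tolerance from Section 4.3.

```python
def overestimation_proportion(estimates_hz, references_hz, tolerance=0.20):
    """Fraction of gross errors that are overestimates (None if no gross errors)."""
    over = under = 0
    for est, ref in zip(estimates_hz, references_hz):
        if est - ref > tolerance * ref:
            over += 1          # estimate too high by more than the tolerance
        elif ref - est > tolerance * ref:
            under += 1         # estimate too low by more than the tolerance
    total = over + under
    return over / total if total else None


# Hypothetical frames: one octave-too-high error, three too-low errors,
# and one frame within tolerance.
reference_hz = [100.0] * 5
estimated_hz = [200.0, 50.0, 99.0, 45.0, 33.0]
print(overestimation_proportion(estimated_hz, reference_hz))  # 0.25
```

The proportion of underestimation errors is one minus this value, as stated for Table 4-2.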

Table 4-4. Gross error rates for musical instruments*

                         Gross error (%)
Algorithm     Underestimates    Overestimates    Total
SWIPE'             1.00              0.10          1.10
SWIPE              1.30              0.02          1.30
SHS                0.88              1.00          1.90
TEMPO              0.29              1.70          2.00
YIN                1.60              0.83          2.40
AC-P               3.20              0.00          3.20
CC                 3.60              0.00          3.60
ESRPD              5.30              1.50          6.80
SHR               15.00              5.30         20.00
Average            3.60              1.20          4.70
* Values computed using two significant digits.

Table 4-5. Gross error rates by instrument family*

                                  Gross error (%)
Algorithm     Brass    Bowed strings    Woodwinds    Piano    Plucked strings    Average
SWIPE'         0.01         0.19           0.14       2.20          8.80           2.30
SWIPE          0.00         0.22           0.23       0.02         11.00           2.30
TEMPO          0.00         2.60           1.40       7.30          4.00           3.10
YIN            0.03         1.10           1.50       0.36         14.00           3.40
SHS            0.02         1.50           0.72      12.00          8.10           4.50
AC-P           0.03         0.56           0.80       0.36         26.00           5.60
CC             0.07         0.83           1.00       0.36         28.00           6.00
ESRPD          4.00         6.90           7.10       6.00         11.00           7.00
SHR           22.00        25.00          38.00      26.00         15.00          25.00
Average        2.90         4.30           5.60       6.10         14.00           6.60
* Values computed using two significant digits. Brass: French horn, bass/tenor trombones, trumpet, and tuba. Bowed strings: double bass, cello, viola, and violin. Woodwinds: flute, bass/alto flutes, bass/Bb/Eb clarinets, alto/soprano saxes. Plucked strings: cello and violin.

Table 4-5 shows the GERs by instrument family. The two best algorithms are SWIPE' and SWIPE. SWIPE' tends to perform better than SWIPE except for the piano, for which SWIPE produces almost no error. On the other hand, the performance of SWIPE' on piano is relatively bad compared to correlation-based algorithms. The family for which the fewest errors were obtained was the brass family; many algorithms achieved almost perfect performance for this family. The family for which the most errors were produced was the strings family playing pizzicato, i.e., by plucking the strings. Indeed, pizzicato sounds were the ones for which the performers produced more errors and the ones that were hardest for us to label (see Appendix B).

Table 4-6. Gross error rates for musical instruments by octave*

                                  Gross error (%)
               46.2 Hz    92.5 Hz    185 Hz     370 Hz     740 Hz     1480 Hz
Algorithm     ±1/2 oct.  ±1/2 oct.  ±1/2 oct.  ±1/2 oct.  ±1/2 oct.  ±1/2 oct.   Average
SWIPE'           1.20       1.00       2.30       0.89       0.13       0.29       0.97
SWIPE            0.08       1.20       3.00       1.00       0.25       0.38       0.99
YIN              3.20       0.95       5.30       1.80       0.69       0.96       2.20
AC-P             0.24       2.00       7.80       2.50       0.71       0.30       2.30
SHS              7.80       2.60       3.20       1.20       0.23       0.14       2.50
CC               0.26       2.60       8.20       2.70       0.93       0.40       2.50
TEMPO           15.00       2.80       2.00       1.10       0.52       0.31       3.60
ESRPD            7.90       2.60       4.80       4.20      12.00      32.00      11.00
SHR             37.00       0.60       1.80      27.00      70.00      81.00      36.00
Average          8.10       1.80       4.30       4.70       9.50      13.00       6.90
* Values computed using two significant digits.

Table 4-6 shows the GERs as a function of octave. The best performance on average was achieved by SWIPE' and SWIPE. The results of the algorithms with an average GER less than 10% are reproduced in Figure 4-1. All algorithms have approximately the same tendency, except at the lowest octave, where a larger variance in the GERs can be observed.

Figure 4-1. Gross error rates for musical instruments as a function of pitch. [Plot of GER (%) on a logarithmic axis against pitch (Hz) at 46.2, 92.5, 185, 370, 740, and 1480 Hz, with one curve each for SWIPE', SWIPE, YIN, AC-P, SHS, CC, and TEMPO.]

Table 4-7. Gross error rates for musical instruments by dynamic*

                  Gross error (%)
Algorithm      pp       mf       ff      Average
SWIPE'        1.30     1.20     0.92      1.10
SWIPE         1.40     1.40     1.20      1.30
SHS           1.50     2.30     2.00      1.90
TEMPO         2.00     1.90     2.00      2.00
YIN           2.20     2.50     2.40      2.40
AC-P          3.30     3.20     3.30      3.30
CC            3.60     3.30     3.80      3.60
ESRPD         5.70     7.10     7.60      6.80
SHR          27.00    29.00    29.00     28.00
Average       5.30     5.80     5.80      5.60
* Values computed using two significant digits.

Table 4-7 shows the GERs as a function of dynamic (i.e., loudness). In general, there is no significant variation of GERs with changes in loudness, although SWIPE has a tendency to reduce the GER as loudness increases [i.e., as the dynamic moves from pianissimo (pp) to fortissimo (ff)].

As a final test, we wanted to validate the choices we made in Chapter 3, i.e., the shape of the kernel, the warping of the spectrum, the weighting of the harmonics, the warping of the frequency scale, and the selection of window type and size. For this purpose, we evaluated SWIPE' replacing one of its features at a time with a more standard feature, i.e., smooth vs. pulsed kernels, square-root vs. raw spectrum, decaying vs. flat kernel envelope, ERB vs. Hertz frequency scale, and pitch-optimized vs. fixed window size. We varied each of these variables independently and obtained the results shown in Table 4-8. Although some of the variations made SWIPE' improve in some of the databases, overall SWIPE' worked better with the features we proposed in Chapter 3.

Table 4-8. Gross error rates for variations of SWIPE'*

                          Gross error (%)
Variation         PBD      KPD      DVD      MIS      Average
Original          0.13     0.83     0.63     1.10      0.67
Flat envelope     0.16     1.00     1.40     0.60      0.79
Hertz scale¹      0.23     1.70     1.40     0.37      0.93
Pulsed kernel     0.21     0.84     3.00     2.60      1.70
Raw spectrum²     0.25     2.10     1.60     4.90      2.20
Fixed WS³         0.15     0.77     1.70     9.10      2.90
* Values computed using two significant digits.
¹ FFTs were computed using optimal window sizes and the spectrum was inter/extrapolated to frequency bins separated by 5 Hz.
² The use of the raw spectrum rather than the square root of the spectrum implies the use of a kernel whose envelope decays as 1/f rather than 1/√f, to match the spectral envelope of a sawtooth waveform.
³ The power-of-two window size whose optimal pitch was closest to the geometric mean pitch of the database was used in each case. A window of 1024 samples was used for the speech databases and a window of 256 samples was used for the musical instruments database.

4.5 Discussion

SWIPE' showed the best performance in all categories. SWIPE was the second best ranked for musical instruments and normal speech but not for disordered speech, for which SHS performed better (see Table 4-1). One possible reason is that it is common for disordered voices to have energy at multiples of subharmonics of the pitch, and therefore algorithms that apply negative weights to the spectral regions between harmonics (e.g., SWIPE, SWIPE', and all autocorrelation-based algorithms) are prone to produce low scores for the pitch. Although SWIPE' is among this group, its use of only the first and prime harmonics substantially reduces the scores of subharmonics of the pitch, producing most of the time a larger score for the pitch than for its subharmonics.

The rankings of the algorithms are relatively stable in all the tables except for SHR, which showed good performance for speech but not for musical instruments. We believe this is caused by the wide pitch range spanned by the musical instruments. This is suggested by the results in Table 4-6, which show that SHR performs well in the octaves around 92.5 Hz and 185 Hz, which correspond to the pitch region of speech, but performs very poorly as the pitch moves away from this region.

Figure 4-1 shows that the relative trend in performance with pitch for musical instruments is about the same for all the algorithms except in the lowest region, where a large variance in performance was observed. However, this variance may be caused by a significant reduction in the number of samples in this region (about 4% of the data). The figure also shows an overall increase in GER in the octave around 185 Hz. We believe this is caused by the presence of a set of difficult sounds in the database with pitches in that region, since it is hard to believe that the algorithms have an inherent difficulty recognizing pitch in that region.

CHAPTER 5
CONCLUSION

The SWIPE pitch estimator has been developed. SWIPE estimates the pitch as the fundamental frequency of the sawtooth waveform whose spectrum best matches the spectrum of the input signal. The schematic description of the algorithm is the following:

1. For each pitch candidate f within a pitch range fmin to fmax, compute its pitch strength as follows:
   a. Compute the square root of the spectrum of the signal.
   b. Normalize the square root of the spectrum and apply an integral transform using a normalized cosine kernel whose envelope decays as 1/√f.
2. Estimate the pitch as the candidate with the highest strength.

An implicit objective of the algorithm was to find the frequency for which the average peak-to-valley distance at its harmonics is maximized. To achieve this, the kernel was set to zero below the first negative lobe and above the last negative lobe, and to avoid bias, the magnitude of these two lobes was halved.

To make the contribution of each harmonic of the sawtooth waveform proportional to its amplitude and not to the square of its amplitude, the square root of the spectrum was taken before applying the integral transform.

To make the kernel match the normalized square-root spectrum of the sawtooth waveform, a 1/√f envelope was applied to the kernel. The kernel was normalized using only its positive part.

To maximize the similarity between the kernel and the square root of the input spectrum, each pitch candidate required its own window size, which in general is not a power of two, and therefore not ideal for computing an FFT. To reduce computational cost, the two closest power-of-two window sizes were used, and their results were combined to produce a single pitch strength value. This had the extra advantage of allowing an FFT to be shared by many pitch candidates.
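
The kernel construction summarized above can be illustrated with a minimal pure-Python sketch that follows the logic of the Matlab function pitchStrengthOneCandidate in Appendix A. The Python function names, the 5 Hz frequency grid, and the idealized line spectrum are assumptions made for this illustration; the actual implementation samples the spectrum on an ERB scale and combines two window sizes.

```python
import math


def primes_up_to(n):
    """Primes <= n by trial division (adequate for the small n used here)."""
    return [m for m in range(2, n + 1)
            if all(m % d for d in range(2, math.isqrt(m) + 1))]


def swipe_kernel(freqs, candidate, primes_only=False):
    """Normalized decaying-cosine kernel for one candidate, sampled at freqs (Hz, > 0)."""
    n = int(freqs[-1] / candidate - 0.75)          # highest harmonic that fits
    harmonics = [1] + primes_up_to(n) if primes_only else list(range(1, n + 1))
    kernel = []
    for f in freqs:
        q = f / candidate                          # frequency in units of the candidate
        k = 0.0
        for h in harmonics:
            a = abs(q - h)
            if a < 0.25:                           # peak around harmonic h
                k = math.cos(2 * math.pi * q)
            elif 0.25 < a < 0.75:                  # valley: half weight per neighboring harmonic
                k += math.cos(2 * math.pi * q) / 2
        kernel.append(k / math.sqrt(f))            # 1/sqrt(f) envelope
    norm = math.sqrt(sum(k * k for k in kernel if k > 0))  # positive part only
    if norm == 0:
        raise ValueError("candidate too high for the given frequency grid")
    return [k / norm for k in kernel]


def pitch_strength(freqs, magnitudes, candidate, primes_only=False):
    """Inner product of the kernel with the normalized square-root spectrum."""
    loudness = [math.sqrt(m) for m in magnitudes]
    norm = math.sqrt(sum(l * l for l in loudness))
    kernel = swipe_kernel(freqs, candidate, primes_only)
    return sum(k * l for k, l in zip(kernel, loudness)) / norm


# Hypothetical line spectrum of a 100 Hz sawtooth: amplitude 1/h at harmonic h.
freqs = [5.0 * i for i in range(1, 401)]           # 5 Hz to 2000 Hz
magnitudes = [0.0] * len(freqs)
for h in range(1, 11):
    magnitudes[20 * h - 1] = 1.0 / h               # grid index of 100*h Hz

for candidate in (50.0, 100.0, 200.0):
    swipe = pitch_strength(freqs, magnitudes, candidate)
    swipe_prime = pitch_strength(freqs, magnitudes, candidate, primes_only=True)
    print(candidate, round(swipe, 3), round(swipe_prime, 3))
```

On this idealized input the 100 Hz candidate receives the highest strength in both variants, and restricting the kernel to the first and prime harmonics lowers the relative score of the 50 Hz subharmonic, which is the effect the text attributes to SWIPE'.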

Another technique used to reduce computational cost was to compute a coarse pitch strength curve and then fine-tune it using parabolic interpolation. A last technique used to reduce computational cost was to reduce the amount of window overlap while still allowing the pitch of a signal as short as four cycles to be recognized.

The ERB frequency scale was used to compute the spectral integral transform since the density of this scale decreases almost proportionally to frequency, which avoids wasting computation in regions where little spectral energy is expected.

SWIPE', a variation of SWIPE, uses only the first and prime harmonics of the signal, producing a large reduction in subharmonic errors by significantly reducing the scores of subharmonics of the pitch.

Except for the obvious architectural decisions that must be taken when creating an algorithm (e.g., selection of the kernel), there are no free parameters in SWIPE and SWIPE', at least in terms of "magic numbers".

SWIPE and SWIPE' were tested using speech and musical instruments databases, and their performance was compared against twelve other algorithms that have been cited in the literature and for which free implementations exist. SWIPE' was shown to outperform all the algorithms on all the databases. SWIPE was ranked second in the normal speech and musical instruments databases, and was ranked third in the disordered speech database.
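
The ERB-scale sampling mentioned above can be sketched in Python using the Hertz-to-ERBs mapping that appears in the Matlab listing of Appendix A (hz2erbs and erbs2hz). The Python function names and the example range are illustrative assumptions; the 0.1 ERB step is the default stated in the listing.

```python
import math


def hz_to_erbs(hz):
    """Hertz to ERBs, following hz2erbs in Appendix A."""
    return 21.4 * math.log10(1 + hz / 229)


def erbs_to_hz(erbs):
    """ERBs to Hertz, following erbs2hz in Appendix A."""
    return (10 ** (erbs / 21.4) - 1) * 229


def erb_spaced_frequencies(f_low, f_high, step_erbs=0.1):
    """Frequencies (Hz) equally spaced on the ERB scale between f_low and f_high."""
    start, stop = hz_to_erbs(f_low), hz_to_erbs(f_high)
    count = int((stop - start) / step_erbs + 1e-9) + 1
    return [erbs_to_hz(start + i * step_erbs) for i in range(count)]


grid = erb_spaced_frequencies(10.0, 5000.0)
# Spacing between neighboring samples at the low and high ends of the grid.
print(grid[1] - grid[0], grid[-1] - grid[-2])
```

The spacing between samples grows with frequency, so fewer samples are spent in the high-frequency region.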

APPENDIX A
MATLAB IMPLEMENTATION OF SWIPE'

This is a Matlab implementation of SWIPE'. To convert it into SWIPE, just replace [ 1 primes(n) ] in the for loop of the function pitchStrengthOneCandidate with [ 1:n ].

function [p,t,s] = swipep(x,fs,plim,dt,dlog2p,dERBs,sTHR)
% SWIPEP Pitch estimation using SWIPE'.
%    P = SWIPEP(X,Fs,[PMIN PMAX],DT,DLOG2P,DERBS,STHR) estimates the pitch of
%    the vector signal X with sampling frequency Fs (in Hertz) every DT
%    seconds. The pitch is estimated by sampling the spectrum in the ERB scale
%    using a step of size DERBS ERBs. The pitch is searched within the range
%    [PMIN PMAX] (in Hertz) sampled every DLOG2P units in a base-2 logarithmic
%    scale of Hertz. The pitch is fine tuned by using parabolic interpolation
%    with a resolution of 1/64 of semitone (approx. 1.6 cents). Pitches with a
%    strength lower than STHR are treated as undefined.
%
%    [P,T,S] = SWIPEP(X,Fs,[PMIN PMAX],DT,DLOG2P,DERBS,STHR) returns the times
%    T at which the pitch was estimated and their corresponding pitch strength.
%
%    P = SWIPEP(X,Fs) estimates the pitch using the default settings PMIN =
%    30 Hz, PMAX = 5000 Hz, DT = 0.01 s, DLOG2P = 1/96 (96 steps per octave),
%    DERBS = 0.1 ERBs, and STHR = -Inf.
%
%    P = SWIPEP(X,Fs,...[],...) uses the default setting for the parameter
%    replaced with the placeholder [].
%
%    EXAMPLE: Estimate the pitch of the signal X every 10 ms within the
%    range 75-500 Hz using the default resolution (i.e., 96 steps per
%    octave), sampling the spectrum every 1/20th of ERB, and discarding
%    samples with pitch strength lower than 0.4. Plot the pitch trace.
%       [x,Fs] = wavread(filename);
%       [p,t,s] = swipep(x,Fs,[75 500],0.01,[],1/20,0.4);
%       plot(1000*t,p)
%       xlabel('Time (ms)')
%       ylabel('Pitch (Hz)')
if ~ exist( 'plim' ) | isempty(plim), plim = [30 5000]; end
if ~ exist( 'dt' ) | isempty(dt), dt = 0.01; end
if ~ exist( 'dlog2p' ) | isempty(dlog2p), dlog2p = 1/96; end
if ~ exist( 'dERBs' ) | isempty(dERBs), dERBs = 0.1; end
if ~ exist( 'sTHR' ) | isempty(sTHR), sTHR = -Inf; end
t = [ 0: dt: length(x)/fs ]'; % Times
dc = 4; % Hop size (in cycles)
K = 2; % Parameter k for Hann window
% Define pitch candidates
log2pc = [ log2(plim(1)): dlog2p: log2(plim(end)) ]';
pc = 2 .^ log2pc;
S = zeros( length(pc), length(t) ); % Pitch strength matrix
% Determine P2-WSs
logWs = round( log2( 4*K * fs ./ plim ) );
ws = 2.^[ logWs(1): -1: logWs(2) ]; % P2-WSs
pO = 4*K * fs ./ ws; % Optimal pitches for P2-WSs
% Determine window sizes used by each pitch candidate
d = 1 + log2pc - log2( 4*K*fs./ws(1) );
% Create ERBs spaced frequencies (in Hertz)
fERBs = erbs2hz([ hz2erbs(pc(1)/4): dERBs: hz2erbs(fs/2) ]');
for i = 1 : length(ws)
    dn = round( dc * fs / pO(i) ); % Hop size (in samples)
    % Zero pad signal
    xzp = [ zeros( ws(i)/2, 1 ); x(:); zeros( dn + ws(i)/2, 1 ) ];
    % Compute spectrum
    w = hanning( ws(i) ); % Hann window
    o = max( 0, round( ws(i) - dn ) ); % Window overlap
    [ X, f, ti ] = specgram( xzp, ws(i), fs, w, o );
    % Interpolate at equidistant ERBs steps
    M = max( 0, interp1( f, abs(X), fERBs, 'spline', 0) ); % Magnitude
    L = sqrt( M ); % Loudness
    % Select candidates that use this window size
    if i==length(ws); j=find(d-i>-1); k=find(d(j)-i<0);
    elseif i==1; j=find(d-i<1); k=find(d(j)-i>0);
    else j=find(abs(d-i)<1); k=1:length(j);
    end
    Si = pitchStrengthAllCandidates( fERBs, L, pc(j) );
    % Interpolate at desired times
    if size(Si,2) > 1
        Si = interp1( ti, Si', t, 'linear', NaN )';
    else
        Si = repmat( NaN, length(Si), length(t) );
    end
    % Weight each candidate by its closeness to this window's optimal pitch
    lambda = d( j(k) ) - i;
    mu = ones( size(j) );
    mu(k) = 1 - abs( lambda );
    S(j,:) = S(j,:) + repmat(mu,1,size(Si,2)) .* Si;
end
% Fine-tune the pitch using parabolic interpolation
p = repmat( NaN, size(S,2), 1 );
s = repmat( NaN, size(S,2), 1 );
for j = 1 : size(S,2)
    [ s(j), i ] = max( S(:,j) );
    if s(j) < sTHR, continue, end
    % At the edges of the candidate range there is no neighbor to interpolate with
    if i==1, p(j)=pc(1); elseif i==length(pc), p(j)=pc(end); else
        I = i-1 : i+1;
        tc = 1 ./ pc(I);
        ntc = ( tc/tc(2) - 1 ) * 2*pi;
        c = polyfit( ntc, S(I,j), 2 );
        ftc = 1 ./ 2.^[ log2(pc(I(1))): 1/12/64: log2(pc(I(3))) ];
        nftc = ( ftc/tc(2) - 1 ) * 2*pi;
        [s(j) k] = max( polyval( c, nftc ) );
        p(j) = 2 ^ ( log2(pc(I(1))) + (k-1)/12/64 );
    end
end

function S = pitchStrengthAllCandidates( f, L, pc )
% Normalize loudness
warning off MATLAB:divideByZero
L = L ./ repmat( sqrt( sum(L.*L) ), size(L,1), 1 );
warning on MATLAB:divideByZero
% Create pitch salience matrix
S = zeros( length(pc), size(L,2) );
for j = 1 : length(pc)
    S(j,:) = pitchStrengthOneCandidate( f, L, pc(j) );
end

function S = pitchStrengthOneCandidate( f, L, pc )
n = fix( f(end)/pc - 0.75 ); % Number of harmonics
k = zeros( size(f) ); % Kernel
q = f / pc; % Normalize frequency w.r.t. candidate
for i = [ 1 primes(n) ]
    a = abs( q - i );
    % Peak's weight
    p = a < .25;
    k(p) = cos( 2*pi * q(p) );
    % Valleys' weights
    v = .25 < a & a < .75;
    k(v) = k(v) + cos( 2*pi * q(v) ) / 2;
end
% Apply envelope
k = k .* sqrt( 1./f );
% K+-normalize kernel
k = k / norm( k(k>0) );
% Compute pitch strength
S = k' * L;

function erbs = hz2erbs(hz)
erbs = 21.4 * log10( 1 + hz/229 );

function hz = erbs2hz(erbs)
hz = ( 10 .^ (erbs./21.4) - 1 ) * 229;
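
The way the listing distributes each pitch candidate over the two closest power-of-two window sizes (the variables ws, d, lambda, and mu in swipep) can be sketched in Python. This is only an illustration under stated assumptions: the function name is hypothetical, the 10 kHz sampling rate and the 40-800 Hz range are example values, and the sketch assumes the search range yields at least two window sizes.

```python
import math


def window_weights(candidate_hz, fs, pitch_min, pitch_max, k=2):
    """Map each power-of-two window size used by a candidate to its weight.

    Mirrors the roles of ws, d, lambda, and mu in the Matlab listing.
    """
    log_ws_max = round(math.log2(4 * k * fs / pitch_min))
    log_ws_min = round(math.log2(4 * k * fs / pitch_max))
    sizes = [2 ** e for e in range(log_ws_max, log_ws_min - 1, -1)]
    # Position of the candidate on the (1-based) window index axis:
    # d = i exactly when the candidate equals the optimal pitch of window i.
    d = 1 + math.log2(candidate_hz * sizes[0] / (4 * k * fs))
    weights = {}
    for i, size in enumerate(sizes, start=1):
        offset = d - i
        if (i == 1 and offset <= 0) or (i == len(sizes) and offset >= 0):
            weights[size] = 1.0                 # beyond the extreme windows
        elif abs(offset) < 1:
            weights[size] = 1 - abs(offset)     # linear weight between neighbors
    return weights


fs = 10000
print(window_weights(78.125, fs, 40, 800))   # {1024: 1.0}
print(window_weights(math.sqrt(78.125 * 156.25), fs, 40, 800))
```

A candidate that coincides with the optimal pitch of a window (78.125 Hz for 1024 samples at 10 kHz) uses only that window, whereas a candidate at the geometric mean of two optimal pitches splits its weight roughly equally between the two neighboring windows.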

APPENDIX B
DETAILS OF THE EVALUATION

B.1 Databases

All the databases used in this work are free and publicly available on the Internet, except the disordered voice database. Besides speech recordings, the speech databases contain simultaneous recordings of laryngograph data, which facilitates the computation of the fundamental frequency. The authors of these databases used them to produce ground truth pitch values, which are also included in the databases. The disordered voice database includes fundamental frequency estimates but, as will be explained later, a different ground truth data set was used. The musical instruments database contains the names of the notes in the names of the files.

B.1.1 Paul Bagshaw's Database

Paul Bagshaw's database (PBD) for evaluation of pitch determination algorithms (Bagshaw et al., 1993; Bagshaw, 1994) was collected at the University of Edinburgh, and is available at http://www.cstr.ed.ac.uk/research/projects/fda. The speech and laryngograph signals of this database were sampled at 20 kHz. The ground truth fundamental frequency was computed by estimating the location of the glottal pulses in the laryngograph data and taking the inverse of the distance between each pair of consecutive pulses. Each fundamental frequency estimate is associated with the time instant midway between the pair of pulses used to derive the estimate.

B.1.2 Keele Pitch Database

The Keele Pitch Database (KPD) (Plante et al., 1995) was created at Keele University and is available at ftp://ftp.cs.keele.ac.uk/pub/pitch. The speech and laryngograph signals were sampled at 20 kHz. The fundamental frequency was estimated by using autocorrelation over a 26.5 ms window shifted at intervals of 10 ms. Windows where the pitch is unclear are marked with special codes.

Both of these speech databases, PBD and KPD, have been reported to contain errors (de Cheveigné, 2002), especially at the end of sentences, where the energy of speech decays and malformed pulses may occur. We will explain later how we deal with this problem.

B.1.3 Disordered Voice Database

The disordered voice database (DVD) was collected by Kay Pentax (http://www.kayelemetrics.com). It includes 657 disordered voice samples of the sustained vowel "ah" sampled at 25 kHz, and a few at 50 kHz. The database includes samples from patients with a wide variety of organic, neurological, traumatic, psychogenic, and other voice disorders.

The database includes fundamental frequency estimates but, by definition, they do not necessarily match the pitch. Therefore, we estimated the pitch ourselves by listening to the samples through earphones and matching the pitch to the closest note, using as reference a synthesizer playing sawtooth waveforms. Assuming that we chose one of the two closest notes every time, this procedure should introduce an error no larger than 6%, which is smaller than the 20% necessary to produce a GE (see Chapter 4).

There were some samples for which the pitch ranged over a perfect fourth or more (i.e., the higher pitch was more than 33% higher than the lower pitch). Since this range is large compared to the permissible 20%, these samples were excluded. Samples for which the range did not span more than a major third (i.e., the higher pitch was no more than 26% higher than the lower pitch) were preserved, and they were assigned the note corresponding to the median of the range. If the median was between two notes, it was assigned to either of them. This should introduce an error no larger than two semitones (12%), which is about half the maximum permissible error of 20%. There were 30 samples for which we could not perceive a pitch with confidence, so they were excluded as well.

Since the ground truth data was based on the perception of only one listener (the author), it could be argued that this data has low validity. To alleviate this, we excluded the samples for which the minimum error produced by any algorithm was larger than 50%.

After excluding the non-pitched samples, the variable-pitch samples, and the samples at which the algorithms disagreed with the ground truth, we ended up with 612 samples out of the original 657. Appendix C shows the ground truth used for each of these 612 samples.

B.1.4 Musical Instruments Database

The musical instruments samples database was collected at the University of Iowa, and is available at http://theremin.music.uiowa.edu. The recordings were made using CD-quality sampling at a rate of 44.1 kHz, but we downsampled them to 10 kHz in order to reduce computational cost. No noticeable change in pitch was perceived as a result, even for the highest-pitched sounds.

This database contains recordings of 20 instruments, for a total of more than 150 minutes and 4,000 notes. The notes are played in sequence using a chromatic scale with silences in between. Each file usually spans one octave and is labeled with the names of the initial and final notes, plus the name of the instrument and other details (e.g., Violin.pizz.mf.sulG.C4B4.aiff).

In order to test the algorithms, the files were split into separate files, each containing a single note with no leading or trailing silence. This process was done semi-automatically by using a power-based segmentation method, and then checking the quality of the segmentation visually and aurally.
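
The percentages quoted in Section B.1.3 for the labeling error of the disordered voice samples follow from the equal-tempered semitone ratio of 2^(1/12). A short Python check (the function name is only illustrative):

```python
def semitones_to_percent(semitones):
    """Relative frequency increase, in percent, of an interval in semitones."""
    return 100 * (2 ** (semitones / 12) - 1)


for name, semitones in [("semitone", 1), ("two semitones", 2),
                        ("major third", 4), ("perfect fourth", 5)]:
    print(f"{name}: {semitones_to_percent(semitones):.1f}%")
# semitone: 5.9%
# two semitones: 12.2%
# major third: 26.0%
# perfect fourth: 33.5%
```

These values agree with the rounded figures of 6%, 12%, 26%, and 33% used in that section, all of which are compared against the 20% gross error tolerance.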

While doing this task, it was discovered that some of the note labels were wrong. The intervals produced by the performers were sometimes larger than a semitone, and therefore the names of the files did not correspond to the notes that were in fact played. This situation was common with string instruments, especially when playing pizzicato.

Therefore, after splitting the files, we listened to each of them and manually corrected the wrong names, using an electronic keyboard as reference. This procedure sometimes introduced name conflicts (i.e., there were repeated notes played by the same instrument, with the same dynamic, etc.), and when this occurred, we removed the repeated notes, trying to keep the note closest to the target. When the conflicting notes were equally close to the target, the "best quality" sound was preserved. This removal of files was done to avoid the overhead of having to add extra symbols to the file names to allow for repetitions, which would have complicated the generation of scripts to test the algorithms.

Since this process of manually correcting the names of the notes was very tedious, especially for the pizzicato sounds, after fixing all the pizzicato bass and violin notes the process was abandoned, and the cello and viola pizzicato sounds were excluded from our evaluation. Arguably, except for the bass, pizzicato sounds are not very common in music, and therefore leaving the cello and viola pizzicato sounds out did not significantly affect the representativeness of the sample.

B.2 Evaluation Using Speech

Whenever possible, each of the algorithms was asked to give a pitch estimate every millisecond within the range 40-800 Hz, using the default settings of the algorithm (an exception was made for ESRPD: instead of using the default settings in the Festival implementation, the recommendations suggested by the author of the algorithm were followed). The range 40-800 Hz was used to make the results comparable to the results published by de Cheveigné (2002).

However, a full comparison is not possible because some other variables were treated differently in that study. The commands issued for each of the algorithms were the following⁶:

AC-P: To Pitch (ac)... 0.001 40 15 no 0.03 0.45 0.01 0.35 0.14 800
AC-S: fxac input_file
ANAL: fxanal input_file
CC: To Pitch (cc)... 0.001 40 15 no 0.03 0.45 0.01 0.35 0.14 800
CEP: fxcep input_file
ESRPD: pda input_file -o output_file -L -d 1 -shift 0.001 -length 0.0384 -fmax 800 -fmin 40 -lpfilter 600
RAPT: fxrapt input_file
SHS: To Pitch (shs)... 0.001 40 15 1250 15 0.84 800 48
SHR: [ t, p ] = shrp( x, fs, [40 800], 40, 1, 0.4, 1250, 0, 0 );
SWIPE: [ p, t ] = swipe( x, fs, [40 800], 0.001, 1/96, 0.1, -Inf );
SWIPE′: [ p, t ] = swipep( x, fs, [40 800], 0.001, 1/96, 0.1, -Inf );
TEMPO: f0raw = exstraightsource( x, fs );
YIN: p.minf0 = 40; p.maxf0 = 800; p.hop = 20; p.sr = fs; r = yin( x, p );

where x is the input signal and fs is the sampling rate in Hertz.

⁶ The command for CATE is not reported because we used our own implementation.

An important issue that had to be considered was the time associated with each pitch estimate. Since all algorithms use symmetric windows, a reasonable choice was to associate each estimate with the time at the center of the window. For CATE, ESRPD, and SHR, the user can set the window size, so we followed the recommendation of their authors and set the window sizes to 51.2, 38.4, and 40 ms, respectively. YIN uses a different window size for each pitch candidate, but the windows are always centered at the same time instant, and the largest window size is twice the largest expected pitch period. For the Praat algorithms AC-P, CC, and SHS, we found through trial and error that they use windows of 3, 1, and 2 times the largest expected pitch period, respectively. For AC-S, ANAL, CEP, RAPT, and TEMPO, the user cannot set the window size, but the algorithms output the time instants associated with each pitch estimate, so we used these times, hoping that they correspond to the centers of the analysis windows used to determine the pitch.

The times associated with the ground truth pitch series are explicitly given in the PBD database, but not in the KPD database. For KPD, each pitch value was associated with the center of its window. Since the ground truth pitch values were computed using 26.5 ms windows spaced 10 ms apart, the first pitch estimate was assigned a time of 13.25 ms, and each successive estimate was assigned a time 10 ms later than the previous one. For the DVD database, each vowel was assumed to have a constant pitch, so the ground truth pitch time series was assumed to be constant.

The purpose of the evaluation was to compare the pitch estimates of the algorithms, not their ability to detect the existence of pitch. Therefore, we included in the evaluation only the regions of the signal at which all algorithms and the ground truth data agreed that pitch existed. To achieve this, we took the time instants of the ground truth values and the time instants produced by all the algorithms that estimated the pitch every millisecond (9 out of 13 algorithms), rounded them to the closest multiple of 1 ms, and took the intersection. This intersection formed the set of times at which all the algorithms were evaluated. The algorithms that produced pitch estimates at a rate lower than 1,000 per second were not considered for finding the intersection because that would have reduced the time granularity of our evaluation, which was desired to be one millisecond.

As suggested in the previous paragraph, some algorithms do not necessarily produce pitch estimates at times that are multiples of one millisecond, i.e., they may produce the estimates at the times t + Δt ms, where t is an integer and |Δt| < 1. Thus, to evaluate them at multiples of one millisecond, the pitch values at the desired times were inter/extrapolated on a logarithmic scale.
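The time bookkeeping described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the code used in this study: the function names (kpd_frame_times, evaluation_grid, log_interp) are hypothetical, the inter/extrapolation is assumed to be piecewise linear in log-pitch, and rounding ties are resolved by Python's built-in round.

```python
import math
from bisect import bisect_right


def kpd_frame_times(n_frames, window_ms=26.5, hop_ms=10.0):
    """Window-center times in ms for frames of 26.5 ms spaced 10 ms apart."""
    return [window_ms / 2 + k * hop_ms for k in range(n_frames)]


def evaluation_grid(*time_series_ms):
    """Round each series to the closest millisecond and intersect them all."""
    grids = [{round(t) for t in series} for series in time_series_ms]
    return sorted(set.intersection(*grids))


def log_interp(times_ms, pitches_hz, query_ms):
    """Inter/extrapolate pitch at query_ms, working on log-pitch.

    Assumption of this sketch: straight lines between neighboring estimates
    in the log domain; queries outside the data reuse the first/last segment.
    """
    if len(times_ms) < 2:
        raise ValueError("need at least two estimates")
    log_p = [math.log(p) for p in pitches_hz]
    out = []
    for q in query_ms:
        # Index of the segment containing q, clamped for extrapolation.
        i = min(max(bisect_right(times_ms, q) - 1, 0), len(times_ms) - 2)
        slope = (log_p[i + 1] - log_p[i]) / (times_ms[i + 1] - times_ms[i])
        out.append(math.exp(log_p[i] + slope * (q - times_ms[i])))
    return out
```

For instance, log_interp([0.0, 10.0], [100.0, 200.0], [5.0]) returns about 141.4 Hz, the geometric mean of the two neighbors, whereas interpolating directly in Hz would give 150 Hz.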

In other words, we took the logarithm of the estimated pitches, inter/extrapolated them to the desired times, and took the exponential of the inter/extrapolated values. Inter/extrapolation in the logarithmic domain was preferred because we believe this is the natural scale for pitch: it is what allows us to recognize a song regardless of whether it is sung by a male or a female voice.

An important issue that must be considered when using simultaneous recordings of the laryngograph and speech signals is that the latter are typically delayed with respect to the former. An attempt to correct this misalignment was reported by the authors of KPD, but its success was not guaranteed. No attempt at correction was reported for PBD. Since pitch in speech is time-varying, such misalignment could increase the estimation error significantly. To alleviate this problem, each pitch time series produced by each algorithm was delayed or advanced in steps of 1 ms, up to 100 ms, in order to find the best match with the ground truth data.

B.3 Evaluation Using Musical Instruments

Considering that many algorithms were designed for speech, the pitch range of the MIS database is probably too large for them to handle. To alleviate this, we excluded the samples that were outside the range 30-1666 Hz, which is nevertheless large compared to the pitch range of speech. Since the range 30-1666 Hz was found to be too large for the Speech Filing System algorithms (AC-S, ANAL, CEP, and RAPT), these algorithms were not evaluated on the MIS database. The commands issued for each of the algorithms were the following:

AC-P: To Pitch (ac)... 0.001 30 15 no 0.03 0.45 0.01 0.35 0.14 1666
CC: To Pitch (cc)... 0.001 30 15 no 0.03 0.45 0.01 0.35 0.14 1666
ESRPD: pda input_file -o output_file -P -d 1 -shift 0.001 -length 0.0384 -fmax 1666 -fmin 30 -n 0 -m 0
SHS: To Pitch (shs)... 0.001 30 15 5000 15 0.84 1666 48
SHR: [ t, p ] = shrp( x, fs, [30 1666], 40, 1, 0.4, 5000, 0, 0 );
SWIPE: [ p, t ] = swipe( x, fs, [30 1666], 0.001, 1/96, 0.1, -Inf );
SWIPE′: [ p, t ] = swipep( x, fs, [30 1666], 0.001, 1/96, 0.1, -Inf );
YIN: p.minf0 = 30; p.maxf0 = 1666; p.hop = 10; p.sr = 10000; r = yin( x, p );

Besides the widening of the pitch range, the only differences with respect to the commands used for the speech databases were for ESRPD and SHS. For both of them, the low-pass filtering was removed in order to use as much information from the spectrum as possible. This was convenient because the sounds were already low-pass filtered at 5 kHz, and therefore the highest-pitched sounds (around 1666 Hz) had no more than three harmonics in the spectrum. The second change was the use of the ESRPD peak-tracker (option -P) in an attempt to make the algorithm improve upon its results with speech.

The evaluation process was very similar to the one followed for speech: the time instants of the ground truth and the pitch estimates were rounded to the closest millisecond, the intersection of all the times was taken, and the statistics were computed only at the times of this intersection. However, there was an issue that had to be considered in this database: some instruments played much longer notes than others. The durations range from tenths of a second for strings playing pizzicato to several seconds for some notes of the piano. If the overall error is computed without taking this into account, the results will be highly biased toward the performance obtained with the instruments that play the longest notes. To account for this, the GER was computed independently for each sample and then averaged over all the samples. However, this introduced an undesired effect: some samples had very few pitch estimates (only one estimate in some cases), and therefore this procedure would give them too much weight, which could introduce noise in our results. Therefore, we discarded the samples for which the number of time instants at which the algorithms were evaluated was less than half the duration of the sample (in milliseconds). This discarded 164 samples, leaving a total of 3459 samples, which was nevertheless a significant amount of data to quantify the performance of the algorithms.
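Two of the scoring steps described in this appendix can be sketched as follows: the search over delays and advances used for the speech databases, and the per-sample averaging of the GER with the discard rule used for the MIS database. This is a hedged illustration, not the code used in this study. The function names are hypothetical, the 20% tolerance used to count a gross error is an assumption of this sketch, and because the text does not say how the "best match" was scored, the sketch assumes the lowest GER, with ties going to the smallest absolute shift.

```python
def gross_error_rate(estimates_hz, truth_hz, tolerance=0.2):
    """Fraction of estimates off from the ground truth by more than `tolerance`.

    The 20% default is an assumption of this sketch.
    """
    if not truth_hz:
        raise ValueError("no evaluation instants")
    errors = sum(abs(e - t) > tolerance * t
                 for e, t in zip(estimates_hz, truth_hz))
    return errors / len(truth_hz)


def best_shift(estimates, truth, max_shift_ms=100):
    """Try delays/advances of up to 100 ms in 1 ms steps.

    `estimates` and `truth` map integer times (ms) to pitch (Hz). A positive
    shift delays the estimates. Assumed criterion: lowest GER, ties going to
    the smallest absolute shift. Returns (shift_ms, ger).
    """
    best = None
    for shift in sorted(range(-max_shift_ms, max_shift_ms + 1), key=abs):
        common = [t for t in truth if t - shift in estimates]
        if not common:
            continue
        ger = gross_error_rate([estimates[t - shift] for t in common],
                               [truth[t] for t in common])
        if best is None or ger < best[1]:
            best = (shift, ger)
    return best


def average_ger(samples, tolerance=0.2):
    """Mean of per-sample GERs, skipping sparsely evaluated samples.

    Each sample is (duration_ms, estimates_hz, truth_hz) with one value per
    evaluated millisecond. A sample is discarded when it was evaluated at
    fewer instants than half its duration in milliseconds.
    """
    kept = [gross_error_rate(est, ref, tolerance)
            for duration_ms, est, ref in samples
            if len(ref) >= duration_ms / 2]
    if not kept:
        raise ValueError("every sample was discarded")
    return sum(kept) / len(kept)
```

Averaging per sample gives a short pizzicato note the same weight as a long piano note, and the discard rule keeps a sample with a single evaluated instant from contributing a GER of exactly 0 or 1.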

APPENDIX C
GROUND TRUTH PITCH FOR THE DISORDERED VOICE DATABASE

Table C-1. Ground truth pitch values for the disordered voice database
Each entry is a sample ID followed by its ground truth pitch in Hz.

AAK02 220.0   AAS16 123.5   ABB09 246.9   ABG04 116.5   ACG13 207.7   ACG20 164.8
ACH16 185.0   ADM14 138.6   ADP02 155.6   ADP11 116.5   AEA03 220.0   AFR17 246.9
AHK02 110.0   AHS20 196.0   AJF12 110.0   AJM05 138.6   AJM29 123.5   AJP25 233.1
ALB18 123.5   ALW27 174.6   ALW28 220.0   AMB22 146.8   AMC14  92.5   AMC16 146.8
AMC23 196.0   AMD07 130.8   AMJ23 123.5   AMK25  77.8   AMP12 220.0   AMT11 246.9
AMV23 185.0   ANA15 155.6   ANA20 155.6   ANB28 196.0   AOS21 110.0   ASK21 116.5
ASR20  92.5   ASR23 130.8   AWE04 155.6   AXD11 174.6   AXD19 196.0   AXL04 196.0
AXL22 196.0   AXS08 155.6   AXT11 185.0   AXT13 196.0   BAH13  98.0   BAS19 293.7
BAT19 185.0   BBR24 164.8   BCM08 233.1   BEF05 185.0   BGS05 246.9   BJH05 174.6
BJK16 174.6   BJK29 103.8   BKB13  87.3   BLB03 110.0   BMK05 246.9   BMM09 233.1
BPF03 116.5   BRT18 311.1   BSD30 130.8   BSG13 174.6   BXD17 138.6   CAC10 185.0
CAH02 196.0   CAK25 196.0   CAL12  92.5   CAL28 261.6   CAR10 196.0   CBD17 164.8
CBD19 174.6   CBD21 207.7   CBR29 174.6   CCM15 110.0   CDW03 146.8   CEN21  92.5
CER16 185.0   CER30 174.6   CFW04 155.6   CJB27 116.5   CJP10  98.0   CLE29 116.5
CLS31 185.0   CMA06 123.5   CMA22 103.8   CMR01 185.0   CMR06 110.0   CMR26 174.6
CMS10 196.0   CMS25 185.0   CNP07 196.0   CNR01 185.0   CPK19 155.6   CPK21 174.6
CPW28 220.0   CRM12 185.0   CSJ16 233.1   CSY01 110.0   CTB30 146.8   CTY03 130.8
CXL08 174.6   CXM07 130.8   CXM14 220.0   CXM18 146.8   CXP02 207.7   CXR13 146.8
CXT08 155.6   DAC26 155.6   DAG01 185.0   DAM08 174.6   DAP17 130.8   DAS10 146.8
DAS24 146.8   DAS30  87.3   DAS40  77.8   DBA02 220.0   DBF18 155.6   DBG14 103.8
DFB09 233.1   DFS23 293.7   DFS24 293.7   DGL30 207.7   DGO03 110.0   DHD08 123.5
DJF23 146.8   DJM14 130.8   DJM28 185.0   DJP04 110.0   DLB25 261.6   DLL25 174.6
DLT09 207.7   DLW04 130.8   DMC03 185.0   DMF11 293.7   DMG07 146.8   DMG24 196.0
DMG27 155.6   DMP04 123.5   DMR27 233.1   DMS01 146.8   DOA27  92.5   DRC15 196.0
DRG19 116.5   DSC25 277.2   DSW14 138.6   DVD19 164.8   DWK04 130.8   DXS20 123.5
EAB27 164.8   EAL06 207.7   EAS11 110.0   EAS15 138.6   EAW21 207.7   EBJ03 146.8
EDG19 196.0   EEB24 164.8   EEC04 196.0   EED07 554.4   EFC08 130.8   EGK30 196.0
EGT03 138.6   EGW23 220.0   EJB01  92.5   EJM04 123.5   ELL04 116.5   EMD08  82.4
EML18 370.0   EMP27 174.6   EOW04 164.8   EPW04 164.8   EPW07 123.5   ERS07 185.0
ESL28 207.7   ESM05 138.6   ESP04 138.6   ESS05 174.6   ESS24 220.0   EWW05 174.6
EXE06 146.8   EXH21 185.0   EXI04 110.0   EXI05 116.5   EXS07 207.7   EXW12 164.8
FAH01 164.8   FGR15 130.8   FJL23 116.5   FLL27 207.7   FLW13 207.7   FMC08 196.0
FMM21 207.7   FMM29 207.7   FMQ20 155.6   FMR17 116.5   FRH18 146.8   FSP13 155.6
FXC12 110.0   FXE24 196.0   FXI23 103.8   GCU31 123.5   GEA24 130.8   GEK02 138.6
GJW09 174.6   GLB01  77.8   GLB22  98.0   GMM06 196.0   GMM07 207.7   GMS03 110.0
GMS05 261.6   GMW18 146.8   GRS20 110.0   GSB11 164.8   GSL04 116.5   GTN21 130.8
GXL21 196.0   GXT10 155.6   GXX13 164.8   HBS12 196.0   HED26 123.5   HJH07 130.8
HLC16 110.0   HLK01 116.5   HLK15 130.8   HLM24 138.6   HMG03 185.0   HML26 207.7
HWR04 164.8   HXB20 196.0   HXI29  82.4   HXL58 116.5   HXR23 116.5   IGD08 196.0
IGD16 174.6   JAB08 130.8   JAB30 164.8   JAF15 146.8   JAJ10 207.7   JAJ22 155.6
JAJ31 155.6   JAL05 174.6   JAM01 207.7   9-Jan 130.8   JAP02 138.6   JAP17 174.6
JAP25 174.6   JBP14  98.0   JBR26 110.0   JBS17  82.4   JBW14 130.8   JCC08 164.8
JCC10 207.7   JCH13 110.0   JCH21 116.5   JCL12 174.6   JCL20 146.8   JCR01 233.1
JDM04 110.0   JEG29 246.9   JES29 123.5   JFC28  82.4   JFG08 138.6   JFG26 138.6
JFM24 174.6   JFN11 110.0   JFN21 116.5   JHW29 146.8   JIJ30 146.8   JJD06 174.6
JJD11 185.0   JJD29 138.6   JJI03 110.0   JJM28 220.0   JLC08 185.0   JLD24 233.1
JLH03 174.6   JLM18 207.7   JLM27 123.5   JLS11 130.8   JLS18 138.6   JMC18 138.6
JME23 164.8   JMH22 155.6   JMJ04 207.7   JMZ16 196.0   JOP07 130.8   JPB07  98.0
JPB17 164.8   JPB30  98.0   JPM25 110.0   JPP27 207.7   JRF30 123.5   JRP20 110.0
JSG18 207.7   JTM05  87.3   JTS02 103.8   JWE23 185.0   JWK27  98.0   JWM15 116.5
JXB16 110.0   JXB26 116.5   JXC21 220.0   JXD01 138.6   JXD08 138.6   JXD30 123.5
JXF11 246.9   JXF29 103.8   JXG05 138.6   JXM30 146.8   JXS09 110.0   JXS14 146.8
JXS23  98.0   JXS39 146.8   JXZ11 123.5   KAB03 185.0   KAC07 246.9   KAO09 261.6
KAS09 233.1   KAS14 220.0   KCG23 246.9   KCG25 220.0   KDB23 220.0   KEP27  87.3

KEW22 220.0   KGM22 220.0   KJB19 164.8   KJI23 138.6   KJI24 130.8   KJL11 116.5
KJM08 130.8   KJS28 207.7   KJW07 103.8   KLC06 207.7   KLD26 164.8   KMC19 207.7
KMC22 207.7   KMC27 207.7   KMS29 155.6   KMW05 311.1   KPS25 103.8   KTJ26 220.0
KWD22 185.0   KXA21 164.8   KXB17 246.9   KXH19 246.9   LAC02 164.8   LAD13 130.8
LAI04 174.6   LAP05 116.5   LAR05 116.5   LBA24 220.0   LCW30 196.0   LDJ11  82.4
LGK25 110.0   LGM01 185.0   LHL08 207.7   LJH06 207.7   LJM24 196.0   LJS31 220.0
LLM22 277.2   LMB18 116.5   LMM04 185.0   LMM17 196.0   LMP12 196.0   LNC11  98.0
LPN14 146.8   LRD21 116.5   LRM03 293.7   LSB18 174.6   LVD28 261.6   LWR18 220.0
LXC01 207.7   LXC11 207.7   LXC28 207.7   LXD22 207.7   LXG17 116.5   LXR15 103.8
LXS05 196.0   MAB06 196.0   MAB11 146.8   MAC03 185.0   MAM08 207.7   MAM21 220.0
MAT26 261.6   MAT28 233.1   MBM05 155.6   MBM21 196.0   MBM25 185.0   MCA07 164.8
MCB20 174.6   MCW14 277.2   MCW21 196.0   MEC06 196.0   MEC28 174.6   MEH26 196.0
MEW15 246.9   MFC20 123.5   MGM28 220.0   MGV01 103.8   MHL19 138.6   MID08 174.6
MJL02 130.8   MJM04 207.7   MJZ18 196.0   MKL31 123.5   MLB16 196.0   MLC08 233.1
MLC23 174.6   MLF13 196.0   MLG10 233.1   MMD01 233.1   MMD15 233.1   MMG27 246.9
MMM12 246.9   MMR01 138.6   MMS29 130.8   MNH04 207.7   MNH14 261.6   MPB23 103.8
MPC21 207.7   MPF25 110.0   MPH12 220.0   MPS09 246.9   MPS21 233.1   MPS23 311.1
MPS26 220.0   MRB11  98.0   MRB25  98.0   MRB30  92.5   MRC20 174.6   MRM16 155.6
MRR22 174.6   MSM20  77.8   MWD28 110.0   MXC10 233.1   MXN24 233.1   MXS06 246.9
MXS10 233.1   MYW04 220.0   MYW14 207.7   NAC21  98.0   NAP26  92.5   NFG08 207.7
NGA16 116.5   NJS06 207.7   NLC08 185.0   NMB28 185.0   NMC22 233.1   NMF04 164.8
NML15 196.0   NMR29 123.5   NMV07 207.7   NXM18 185.0   NXR08 185.0   OAB28  69.3
ORS18  98.0   OWH04 233.1   OWP02 246.9   PAM01  92.5   PAT10 110.0   PCL24 110.0
PDO11 110.0   PEE09 185.0   PFM03 103.8   PGB16 110.0   PJM12  98.0   PLW14 207.7
PMC26  92.5   PMD25 130.8   PMF03 233.1   PSA21 155.6   PTO18  98.0   PTO22  98.0
PTS01 130.8   RAB08 185.0   RAB22 196.0   RAE12 110.0   RAM30 261.6   RAN30 261.6
RBC09 155.6   RBD03 155.6   RCC11 233.1   REC19 233.1   REW16 110.0   RFC19 233.1
RFC28 116.5   RFH18 155.6   RFH19 130.8   RGE19  82.4   RHG07 220.0   RHP12 196.0
RJC24  98.0   RJF14 164.8   RJF22 174.6   RJL28  92.5   RJR15 110.0   RJR29 116.5
RJZ16 185.0   RLM21 123.5   RMB07  98.0   RMC07 155.6   RMC18 196.0   RMF14 196.0
RML13 233.1   RMM13 246.9   RPC14 174.6   RPJ15 116.5   RPQ20 103.8   RSM20 130.8
RTH15  87.3   RTL17  87.3   RWC23  98.0   RWF06 146.8   RWR14 110.0   RWR16 116.5
RXG29  98.0   RXM15 110.0   RXP02 138.6   RXS13 130.8   SAC10 103.8   SAE01 164.8
SAM25 138.6   SAR14 207.7   SAV18 277.2   SBF11 207.7   SBF24 207.7   SCC15 138.6
SCH15 207.7   SEC02 196.0   SEF10  98.0   SEG18 130.8   SEH26 174.6   SEH28 246.9
SEK06 164.8   SEM27 116.5   SFD17 116.5   SFD23  87.3   SFM22  92.5   SGN18 138.6
SHC07 164.8   SHD17 220.0   SHT20 138.6   SJD28 123.5   SLC23 220.0   SLG05 196.0
SLM27  87.3   SMD22 207.7   SMK04 370.0   SMK23 146.8   SMW17  77.8   SPM26  92.5
SRB31 174.6   SRR24 130.8   SWB14 123.5   SWS04 155.6   SXC02 146.8   SXG23 174.6
SXH10 185.0   SXM27 196.0   SXS16 220.0   SXZ01  87.3   TAB21 174.6   TAC22 207.7
TAR18 155.6   TCD26 138.6   TES03 220.0   TLP13 233.1   TLS08 185.0   TMK04 261.6
TNC14 207.7   TPM04 155.6   TPP11 220.0   TPP24 185.0   TPS16 116.5   TRF06 116.5
TRF21  98.0   TRS28 185.0   VFM11 220.0   VJV02 130.8   VJV09 110.0   VMB18 174.6
VMS04 277.2   VMS05 246.9   VRS01 164.8   WBR12 277.2   WCB24 174.6   WDK04 110.0
WDK13 220.0   WDK17 130.8   WDK47 146.8   WFC07 116.5   WJB06 233.1   WJB12 110.0
WJF15 174.6   WJP20 123.5   WPB30 123.5   WPK11 110.0   WSB06 110.0   WST20  87.3
WTG07 130.8   WXE04 123.5   WXH02 103.8   WXS21 110.0   LME07 659.3   EAM05 146.8
JEC18 196.0   TMD12 349.2   SMA08 220.0   SHD04 349.2   KXH30 174.6   VAW07 174.6

REFERENCES

American Standards Association (1960). “Acoustical Terminology SI 1-1960” (American Standards Association, New York).
American National Standards Institute (1994). ANSI S1.1-1994, “American National Standard Acoustical Terminology” (Acoustical Society of America, New York).
Askenfelt, A. (1979). “Automatic notation of played music: the VISA project,” Fontes Artis Musicae 26, 109-118.
Bagshaw, P. C., Hiller, S. M., and Jack, M. A. (1993). “Enhanced pitch tracking and the processing of F0 contours for computer and intonation teaching,” Proc. European Conf. on Speech Comm. (Eurospeech), pp. 1003-1006.
Bagshaw, P. C. (1994). “Automatic prosodic analysis for computer aided pronunciation teaching,” doctoral dissertation.
Boersma, P. (1993). “Accurate short-term analysis of the fundamental frequency and the harmonics-to-noise ratio of a sampled sound,” Proc. Institute of Phonetic Sciences 17, 97-110.
Dannenberg, R. B., Birmingham, W. P., Tzanetakis, G. P., Meek, C. P., Hu, N. P., and Pardo, B. P. (2004). “The MUSART Testbed for Query-by-Humming Evaluation,” Comp. Mus. J. 28, 34-48.
de Boer, E. (1976). “On the ‘residue’ and auditory pitch perception,” in Handbook of Sensory Physiology, edited by W. D. Keidel and W. D. Neff (Springer-Verlag, New York), Vol. V/3, 479-583.
De Bot, K. (1983). “Visual feedback of intonation I: Effectiveness and induced practice behavior,” Language and Speech 26, 331-350.
De Cheveigné, A., and Kawahara, H. (2002). “YIN, a fundamental frequency estimator for speech and music,” J. Acoust. Soc. Am. 111, 1917-1930.
Di Martino, J., and Laprie, Y. (1999). “An efficient F0 determination algorithm based on the implicit calculation of the autocorrelation of the temporal excitation signal,” Proc. EUROSPEECH, 2773-2776.
Doughty, J., and Garner, W. (1947). “Pitch characteristics of short tones. I. Two kinds of pitch threshold,” J. Exp. Psychol. 37, 351-365.
Duifhuis, H., Willems, L. F., and Sluyter, R. J. (1982). “Measurement of pitch in speech: an implementation of Goldstein's theory of pitch perception,” J. Acoust. Soc. Am. 71, 1568-1580.

Fant, G. (1960). Acoustic theory of speech production, with calculations based on X-ray studies of Russian articulations (Mouton De Gruyter).
Fastl, H., and Stoll, G. (1979). “Scaling of pitch strength,” Hear. Res. 1, 293-301.
Fastl, H., and Zwicker, E. (2007). Psychoacoustics: Facts and Models (Springer, Berlin).
Galembo, A., Askenfelt, A., Cuddy, L. L., and Russo, F. A. (2001). “Effects of relative phases on pitch and timbre in the piano bass range,” J. Acoust. Soc. Am. 110, 1649-1666.
Glasberg, B. R., and Moore, B. C. J. (1990). “Derivation of auditory filter shapes from notched-noise data,” Hear. Res. 47, 103-138.
Hermes, D. J. (1988). “Measurement of pitch by subharmonic summation,” J. Acoust. Soc. Am. 83, 257-264.
Houtgast, T. (1976). “Subharmonic pitches of a pure tone at low S/N ratio,” J. Acoust. Soc. Am. 60, 405-409.
Houtsma, A. J. M., and Smurzynski, J. (1990). “Pitch identification and discrimination for complex tones with many harmonics,” J. Acoust. Soc. Am. 87, 304-310.
Kawahara, H., Katayose, H., de Cheveigné, A., and Patterson, R. D. (1999). “Fixed Point Analysis of Frequency to Instantaneous Frequency Mapping for Accurate Estimation of F0 and Periodicity,” Proc. EUROSPEECH 6, 2781-2784.
Medan, Y., Yair, E., and Chazan, D. (1991). “Super resolution pitch determination of speech signals,” IEEE Trans. Acoust., Speech, Signal Process. 39, 40-48.
Moore, B. C. J. (1977). “Effects of relative phase of the components on the pitch of three-component complex tones,” in Psychophysics and Physiology of Hearing, edited by E. F. Evans and J. P. Wilson (Academic, London).
Moore, B. C. J. (1986). “Parallels between frequency selectivity measured psychophysically and in cochlear mechanics,” Scand. Audiol. Supplement 25, 139-152.
Moore, B. C. J. (1997). An Introduction to the Psychology of Hearing (Academic, London).
Murray, I. R., and Arnott, J. L. (1993). “Toward the simulation of emotion in synthetic speech: A review of the literature of human vocal emotion,” J. Acoust. Soc. Am. 93, 1097-1108.
Noll, A. M. (1967). “Cepstrum pitch determination,” J. Acoust. Soc. Am. 41, 293-309.
Oppenheim, A. V., Schafer, R. W., and Buck, J. R. (1999). Discrete-Time Signal Processing (Prentice Hall, New Jersey).
Patel, A. D., and Balaban, E. (2001). “Human pitch perception is reflected in the timing of stimulus-related cortical activity,” Nature Neuroscience 4, 839-844.

Plante, F., Meyer, G., and Ainsworth, W. A. (1995). “A pitch extraction reference database,” EUROSPEECH-1995, 837-840.
Rabiner, L. R. (1977). “On the Use of Autocorrelation Analysis for Pitch Detection,” IEEE Trans. Acoust., Speech, Signal Process. 25, 24-33.
Robinson, K. L., and Patterson, R. D. (1995a). “The duration required to identify the instrument, the octave, or the pitch-chroma of a musical note,” Music Perception 13, 1-15.
Robinson, K. L., and Patterson, R. D. (1995b). “The stimulus duration required to identify vowels, their octave, and their pitch-chroma,” J. Acoust. Soc. Am. 98, 1858-1865.
Schroeder, M. R. (1968). “Period histogram and product spectrum: new methods for fundamental frequency measurement,” J. Acoust. Soc. Am. 43, 829-834.
Schwartz, D. A., and Purves, D. (2004). “Pitch is determined by naturally occurring periodic sources,” Hear. Res. 194, 31-46.
Secrest, B., and Doddington, G. (1983). “An integrated pitch tracking algorithm for speech systems,” Proc. ICASSP-83, pp. 1352-1355.
Sethares, W. A. (1998). Tuning, Timbre, Spectrum, Scale (Springer, London).
Shackleton, T. M., and Carlyon, R. P. (1994). “The role of resolved and unresolved harmonics in pitch perception and frequency modulation discrimination,” J. Acoust. Soc. Am. 95, 3529-3540.
Shofner, W. P., and Selas, G. (2002). “Pitch strength and Stevens’s power law,” Percept. Psychophys. 64, 437-450.
Sondhi, M. M. (1968). “New Methods of Pitch Extraction,” IEEE Trans. Audio and Electroacoustics AU-16, 262-266.
Spanias, A. S. (1994). “Speech coding: a tutorial review,” Proc. IEEE 82, 1541-1582.
Stevens, S. S. (1935). “The relation of pitch to intensity,” J. Acoust. Soc. Am. 6, 150-154.
Sun, X. (2000). “A pitch determination algorithm based on subharmonic-to-harmonic ratio,” Proc. Int. Conf. Spoken Language Process. 4, 676-679.
Takeshima, H., Suzuki, Y., Ozawa, K., Kumagai, M., and Sone, T. (2003). “Comparison of loudness functions suitable for drawing equal-loudness level contours,” Acoust. Sci. Tech. 24, 61-68.
Verschuure, J., and van Meeteren, A. A. (1975). “The effect of intensity on pitch,” Acustica 32, 33-44.
von Helmholtz, H. (1863). On the Sensations of Tone as a Physiological Basis for the Theory of Music (Kessinger Publishing).

Wang, M., and Lin, M. (2004). “An Analysis of Pitch in Chinese Spontaneous Speech,” Int. Symp. on Tonal Aspects of Tone Languages, Beijing, China.
Wiegrebe, L., and Patterson, R. D. (1998). “Temporal dynamics of pitch strength in regular interval noises,” J. Acoust. Soc. Am. 104, 2307-2313.
Huang, X., Acero, A., and Hon, H. W. (2001). Spoken Language Processing: A Guide to Theory, Algorithm, and System Development (Prentice Hall, New Jersey).
Yost, W. A. (1996). “Pitch strength of iterated rippled noise,” J. Acoust. Soc. Am. 100, 3329-3335.

BIOGRAPHICAL SKETCH

Arturo Camacho was born in San Jose, Costa Rica, on October 21, 1972. He attended elementary school at Centro Educativo Roberto Cantillano Vindas and high school at Liceo Salvador Umaña Castro. After that, he studied Music at the Universidad Nacional, and at the same time he performed as a pianist in some of the most popular Costa Rican Latin music bands of the 1990s. He also studied Computer and Information Science at the Universidad de Costa Rica, where he obtained his B.S. degree in 2001. He worked for a short time as a software engineer at Banco Central de Costa Rica during that year, but soon moved to the United States to pursue graduate studies in Computer Engineering at the University of Florida. He received his M.S. and Ph.D. degrees in 2003 and 2007, respectively.

Arturo’s research interests span all areas of automatic music analysis, from the lowest-level tasks, like pitch estimation and timbre identification, to the highest-level tasks, like analysis of harmony and genre. His dream is to one day have a computer program that allows him (and everyone) to analyze music as well as or better than a well-trained musician would. Currently, Arturo lives happily with his beloved wife Alexandra, another Ph.D. Gator in Computer Engineering, whom he married in 2002, and their beloved daughter Melissa, who was born in 2006.


bf5db85ba2432ed5c5e8578c4cf1c8dd
325c325680e1d60393f666caeed59617b60fcc35
115763 F20101115_AABACA camacho_a_Page_074.jp2
ff57886469685cf772ae103028076684
0ab83efa3855ceb06ada62dd5a90af8bd9277caa
700516 F20101115_AABABM camacho_a_Page_049.jp2
05d8aa37bd4c711bf924a61e115498d1
adc5546e0db7790513c43224ddd5c4633afbb151
104594 F20101115_AABAAY camacho_a_Page_027.jp2
71b741a9661481e3b9a21d739770b4ed
d1278a69dd87297d854c2f1f63e4b011cb80dba5
68116 F20101115_AAAZZR camacho_a_Page_092.jpg
82f0582c21d96e71da79c27e800d3ae0
05eae7056e211cc757132e4749f0a5f9cd17f6b2
846480 F20101115_AABACB camacho_a_Page_075.jp2
2ee04f2dc5b66f50edf108786140a4d8
a465406ac785b4562e1a25cd819b4a2add12cd3a
808860 F20101115_AABABN camacho_a_Page_050.jp2
10703d52aa2b8fcb6553857be015f86b
179348a0fc7d2d8b8dfb6bc5af8779ecf46e70f3
598413 F20101115_AABAAZ camacho_a_Page_029.jp2
795c556caeccb960b339929ba3b76689
e63be7f78fdde9cfab16d82f6b97bd80733b4e05
65865 F20101115_AAAZZS camacho_a_Page_093.jpg
a91a0e06f4496372630b38eff355a33c
893a7c85b87b2b3a04f2d97a5e2608fffbdefe8c
829577 F20101115_AABACC camacho_a_Page_078.jp2
67d388e4e9c74cc935d2c1b41963c9d1
e5020da3fa7fbd1935cee220a3fa5622f0edab49
110419 F20101115_AABABO camacho_a_Page_053.jp2
5ae6bb2cfca4f7518abfe231d6b94e08
314391a4e8678d70e9129379085feff615a54c8e
57312 F20101115_AAAZZT camacho_a_Page_098.jpg
8764f9fbabaf9e879e417d30fa2c26e1
333cc834edbe2e82e1cb5d69a75ff4095ab9a674
97392 F20101115_AABACD camacho_a_Page_081.jp2
5347718b6fd26b3bff3aa58dddb8dde6
9c96648c640af9c6cfa5cbe9b296db9758fe5a61
855707 F20101115_AABABP camacho_a_Page_054.jp2
1beb8063440d63c365a79f4e5ae15027
0c491b0145cbfe3344f3b8f03c78339340061f0d
82535 F20101115_AAAZZU camacho_a_Page_099.jpg
6ba72e6869174417a290d0424fae0637
b0217ccd95d9ca316ba6ce569ccbce9db3dafd67
369232 F20101115_AABACE camacho_a_Page_084.jp2
e9f33e2c6d4225b4f16798fb5a386b60
2e52d9a26838cc00e8b4c1d1d99ba1594754faca
112811 F20101115_AABABQ camacho_a_Page_056.jp2
baca0852aa84f7e6f3bbe8b0bf52aaf5
d2e1c4ac9f3a727d4fbf858b4d3cc2624696689f
68441 F20101115_AAAZZV camacho_a_Page_102.jpg
65465acaf97f1d8689028719f03833c3
94d6de8314718fe0c2ca28d9999a3d4056b76874
127708 F20101115_AABACF camacho_a_Page_087.jp2
973fb1aa7b07d4e880a72692ff51502d
07c395f461aa4cfbc8daaf01fd5d31f4e3925428
605034 F20101115_AABABR camacho_a_Page_058.jp2
52fca370ebe187a42bd4e129cf9cbaa1
41125a2271eb1a937b0db8aa85e6b1f614084289
70993 F20101115_AAAZZW camacho_a_Page_103.jpg
3c2a0ed515d8049d64f7758e59a802c0
cd5e549989774194b7de5f0f25a7faa61b4eb9bd
151273 F20101115_AABACG camacho_a_Page_088.jp2
3dbb312e18d036875fa70163e5920669
04b09aefdf7b3a9997ec5914ab21fd14394aa09d
708045 F20101115_AABABS camacho_a_Page_059.jp2
e5b541981a5629e680b71433db519b9e
8185232761a7de35c3520234da729d3df114d043
71322 F20101115_AAAZZX camacho_a_Page_104.jpg
e2ead39af15ce85069925e6bbf1d3115
4fe0d7f52fd46647e920b552f9ece01c8255e42a
111631 F20101115_AABACH camacho_a_Page_089.jp2
2d319c9b0a68fc1cc7a819fd82053a18
bd096bb8fc8c2ede0e06c974e1b0472044874505
422928 F20101115_AABABT camacho_a_Page_061.jp2
3a3fe049bb4099dda17980ebcbbe73e0
e3bc3bdf72e9256e3769c7defaf1300e11e545ac
77687 F20101115_AAAZZY camacho_a_Page_105.jpg
03fc1a410bbecc002f3c432e1a7bef23
d6b97b0834e9720aca21d1404a3c0c6c907f2d9d
80227 F20101115_AABACI camacho_a_Page_091.jp2
94ba5d42cd9999088a53e710635bf47f
6326c86f4d6375fa79231c01ed0497bf3f4202c7
754870 F20101115_AABABU camacho_a_Page_064.jp2
252a1ed52fe9d4b6dd0954f9211632d0
8dfdf3aa02fb5b73b5b569bb1846077ac7e2928f
73319 F20101115_AAAZZZ camacho_a_Page_106.jpg
55e94c1bd74e67450e642439ad445049
569351f67a223eb7c30fb59ca2f674b54d78987e
94892 F20101115_AABACJ camacho_a_Page_092.jp2
153964a9a45d6e120f9c952073ccad7d
3826860c1f25276fcd10de28c8c86f7e3ec7cdd4
105630 F20101115_AABABV camacho_a_Page_065.jp2
764cead5138c8eccdaada09a2688df63
55080426330df89521343ab06b6a8fa5bb3d5e24
869988 F20101115_AABACK camacho_a_Page_093.jp2
5345131c53fd1d09b1f152b918004588
c0afb122879425f1b13721ff49db008329cb405f
997928 F20101115_AABABW camacho_a_Page_066.jp2
e9fcceef5728afdf49791e34fc1b3800
2b4c1bf732a9c8b3293d33f9f7b58a7c988b794c
25271604 F20101115_AABADA camacho_a_Page_010.tif
18cebc698609ade01b7023c176849b79
82f39e09f2cf8d33dbd19710c5ccf127de1300ce
97857 F20101115_AABACL camacho_a_Page_094.jp2
b89f7c28700133cf8129ce24832e3c6d
dc3f1133640b7af0eb2ac505601bfa0bc63c8145
107472 F20101115_AABABX camacho_a_Page_068.jp2
bbbb35bf8c2c7f4e87e7d1e8e1771e6c
54eb062ca1e5ba60584756b1e901fa3d51540481
F20101115_AABADB camacho_a_Page_011.tif
bc2f8c85458b90a3a15fe80741d23de3
334cf403772f3257c731b712bc00ff06b25736be
115264 F20101115_AABACM camacho_a_Page_095.jp2
3b8c9f23289d5ed5c47210cbf91e7168
71078df399f16ec2e22b181400113ea43ebb0673
709683 F20101115_AABABY camacho_a_Page_070.jp2
1564abff5bf51844d369279ec0722694
329095b7db397d94a897e72b3d3e13f9721aa084
F20101115_AABADC camacho_a_Page_013.tif
25b8370fbbc8a1f54b2ee11e0765f6fd
312173f105819ae48613683cf1051d1d7be67ced
105114 F20101115_AABACN camacho_a_Page_097.jp2
2bd4ac75357d1fa285e921510ef3280b
e0d44d7e2f137365496cc0d7604cfb2e16305b3b
106518 F20101115_AABABZ camacho_a_Page_071.jp2
e4747812e318937a7f43230c13174312
b09d668ac3a634535153d08399818cd928feafea
F20101115_AABADD camacho_a_Page_014.tif
6a2011d7d665f50ff0ecbebcbb7d2796
f969c60f528865ea090ef1d889f8613a2534f5ed
105584 F20101115_AABACO camacho_a_Page_104.jp2
815920e1a6794c4d7db937bf46bc1c17
1dca44178e0e2c303f48f4d53594303e6ffbffbe
F20101115_AABADE camacho_a_Page_015.tif
7d341a96352b3679fbce7b985ece5c16
26b9df60078b515fe89d443e541fa966c46416e1
116822 F20101115_AABACP camacho_a_Page_105.jp2
4376ac2e63ea4499b293f84f8b6ef2be
2997a2a043ce8cd51dc431174aa149116ac6e21a
F20101115_AABADF camacho_a_Page_016.tif
8230e7e3ca194bf581460e30fcaba0a1
2d850a3d4f4f7cb8401eda7ce974a655cfb1e662
115944 F20101115_AABACQ camacho_a_Page_109.jp2
03ae20e3df35dc465e51bad7155a9464
272bcf01cc96b59938343b56cfca9c0fc8326ede
F20101115_AABADG camacho_a_Page_022.tif
7924d44b2abd181a4b61fd891dad812c
816b808934c5a27823fd7ab328cf86c8ec39bb46
171889 F20101115_AABACR camacho_a_Page_111.jp2
0d7ae2787f405334aeb208ca0d1408d7
612adcd953d5122d2f178f4fcb4cbd3d76f41c6c
F20101115_AABADH camacho_a_Page_023.tif
162f7eb1f92f808ff067b666190f136b
c5c0db33194b95bb35f7edd9a43ab05ca53a5e2a
117967 F20101115_AABACS camacho_a_Page_112.jp2
d0eeab56bf48d79fcb89cf5d83722ca1
9a4f41fd08f77bbf3b436d834d3df8b4c346de22
F20101115_AABADI camacho_a_Page_024.tif
67734087174ce1f5ee77304921211fd9
29cdde6a02b2d75e94b636e855eb100e6d6e648e
34467 F20101115_AABACT camacho_a_Page_115.jp2
ea139512659fffc4da7c8873c23d2f0d
9a5730f42ffc7e7adf5df2c5ea2226d520a38b79
F20101115_AABADJ camacho_a_Page_025.tif
ca70fd9a91e633e28807f9e04e6de026
d48837266dafc226670fdceb289bfd5a0242a33f
F20101115_AABACU camacho_a_Page_001.tif
eade3a0845fd851395b01614671daff6
657723d50a75b34ea4df9ea4555e57f80dec8c7f
F20101115_AABADK camacho_a_Page_028.tif
9151f23a7d3916bbb744dedb78f784a8
bea04acdfa08622d381a7a1101eb5b66924f5296
F20101115_AABACV camacho_a_Page_003.tif
29daaae0b41f1e199387f6c8d731e14c
294f5acad0f0f7f07a6f274fe0ccd12345f673d4
F20101115_AABADL camacho_a_Page_030.tif
bdcd7bbb82aa22b075028ac6f0bca80c
1b9e089bbdf78c49da263443335b64b299cb1f34
F20101115_AABACW camacho_a_Page_004.tif
896c90f3b6550a4962385ba7f214c761
82f1a833ebe2117156c7d76be0aaa93aa8195316
F20101115_AABACX camacho_a_Page_006.tif
33fa786b1ed1c623d1e3dbbe24e73f29
73a51cc114440ef3c525a007d9ca13f63a55fcdb
F20101115_AABAEA camacho_a_Page_054.tif
b90f69937e18fc4800c0f846ef5a33ce
ace403ead0fc16ec22ea17a0dbb882a12a58ee78
F20101115_AABADM camacho_a_Page_032.tif
089045c645540748cdab6dd6d9510285
54a889a03a08abbb4a5f0eb3136b9a144b0582f7
F20101115_AABACY camacho_a_Page_007.tif
6ef89f5ef9107a0ed960b6771f71c1f5
d9ba63fd11d3b8bcf9b8ec024a2d657f233a134c
F20101115_AABAEB camacho_a_Page_055.tif
29db18cb461166e2f63081a43b693b2a
2a0f748aa083c2e48bb6fc5dce54446d53c42d2b
F20101115_AABADN camacho_a_Page_034.tif
12a320f87d462dea1d47838cd4ce0587
1c61c1644ba5bbbd3b81d0b8e51d5d6a5d191211
F20101115_AABACZ camacho_a_Page_009.tif
ca0f43ce4b85908578ce6dc133421bd1
d29fbd4ead6713bf7d4fb209b1353d1da470e710
F20101115_AABAEC camacho_a_Page_059.tif
42ccf7afb86564902fa28609e1c8d83d
e0cfdb997ffb7ca9ca48c831348277218c9bbf83
F20101115_AABADO camacho_a_Page_035.tif
630ce00e464c8a85e2f7a30e0584e5e4
82a52df10beb68bacf400b88532ec829a9dd24ce
F20101115_AABAED camacho_a_Page_062.tif
8ec33ddfea9ebbf141b6d500745e673b
2ee89f0fe1ee49ce79329a3dcff85fcb8793eb60
F20101115_AABADP camacho_a_Page_036.tif
d42254356c19ad4e8f95f1bff774e771
2480eba4c51d8eebe4ef268cf456ee01720865cd
F20101115_AABAEE camacho_a_Page_063.tif
8305defe0c2690e312b96aabd1d8bc30
323183f7d5b2877d8254df41c7a3fd3ce0cdedbd
F20101115_AABADQ camacho_a_Page_037.tif
e4aaa92b1317b612e5125d8ddf77e40c
2b66acffc9e76610b376253f228eb288b48229e2
F20101115_AABAEF camacho_a_Page_064.tif
d94c2477cb9d8c8b35013165482c02ae
9c575402d55293c559063ebd41504df49864cde8
F20101115_AABADR camacho_a_Page_038.tif
3280a6d807eba9d70d7dc9595b6616c4
e2f0a166906959ed7f400f7536d5599bff7feaff
F20101115_AABAEG camacho_a_Page_068.tif
4525f3b43fe1ccd1cfcbaee3b3dc7744
62b421bf82b503082fbcfe340fd0a063c2141407
F20101115_AABADS camacho_a_Page_039.tif
10f7f9f7ede4259c4d24d1b42129183e
b3a76499a3a2cb1a7247f2aa3c7a91987dadb59d
F20101115_AABAEH camacho_a_Page_070.tif
8f475bd1a007300667fb5bf71b097401
a2ffd13ae5eabb2eec56e492b9c30eaa6ebe25ad
F20101115_AABADT camacho_a_Page_043.tif
3407cc184bcf9688005846e29d1267c0
f94dbcdd3cb8ac5b3474d957acd2922fe741d92d
F20101115_AABAEI camacho_a_Page_071.tif
3add6a845db2b339a9d0a2d66b18be16
1d13fa8cf658cf568b2fdda2bb75a51f0d3d162a
F20101115_AABADU camacho_a_Page_046.tif
f3d7ab2108b8ad0a4a1e1f8cab808822
fa16e498caa485a9b4a21c9eecfb4ee1ce24d34c
F20101115_AABAEJ camacho_a_Page_072.tif
f06215ccdb44ffdca457ca84f7cf8b7f
d9227bb81d3fa51f5287726733947c1dd22096ab
F20101115_AABADV camacho_a_Page_047.tif
b43b1f46783e91d07341262e23436dc9
7bbb770b7b4febfc14310f714988bdf4489bfc96
F20101115_AABAEK camacho_a_Page_073.tif
0168a1448176e984178ca8e969ce2187
93664d9997d44d705f5b75b6ecdc73166fdcb9b0
F20101115_AABADW camacho_a_Page_048.tif
bc0c199cd812af32b0335ba7c8cbfd9f
b084a33fb3833891333b925b880288fad6d1ddfa
F20101115_AABAEL camacho_a_Page_074.tif
a9510474a22ac6cf584cb84c9c68affc
de1734e1134fd9cc5da6a279afe39cf1f1421a09
F20101115_AABADX camacho_a_Page_049.tif
fb1004ad799444ccb215dc9b9a8e7ed8
05e1d6fbef2c3334afd46d502cabd15e5554a488
F20101115_AABAFA camacho_a_Page_103.tif
cb3645146b5f538c2e913ed73bfc373e
5d4d2b3bf117a2d5b2bd15ab08f4b8ff76e25416
F20101115_AABAEM camacho_a_Page_078.tif
a5093fe06889ca8e804bdfed2c864308
612410d214f415b3d248ba6507aa0f3221842bb3
F20101115_AABADY camacho_a_Page_050.tif
d5ceff54f29c92b5efc03f23d8f51632
18739afbcc655afded2422ec77105042c8d7b9fc
F20101115_AABAFB camacho_a_Page_105.tif
67873768679feb4d22da7cd2b7686ed0
efc4bddbdf686ab71415169fdadb4f1e746460bd
F20101115_AABADZ camacho_a_Page_051.tif
8d2eedf5421791e45d56fe15b6560f4d
f86047a00f4ddc95f1da6145308a66dbca724057
F20101115_AABAFC camacho_a_Page_108.tif
6abd9609f8c858da5be2d99481aa0981
0acf917c24a3f69339546fed0f273e1225591e88
F20101115_AABAEN camacho_a_Page_079.tif
8fa57965b50015a8ff513d335edb77b0
b72333c9b5a45319e77eec900bae19eec2b503b2
F20101115_AABAFD camacho_a_Page_109.tif
b3789fd3dc8fc04733d32d64270b8883
1ee8b13c113a02192c33659db46d87897b26e129
F20101115_AABAEO camacho_a_Page_081.tif
cedaf6e3f8a3fd1be83c94846a28cd22
225ec9c783f2375cc4499a5264017203ae3ac734
F20101115_AABAFE camacho_a_Page_110.tif
489cc122dc779d6ebe5f3135c8a953b7
24dafdaa4baabd81d302e12176609e3188b64890
F20101115_AABAEP camacho_a_Page_082.tif
debb43094af0d3921d8f4e7bfac98114
4d0fd47be46e895a5265e76e38c723127f347b08
F20101115_AABAFF camacho_a_Page_113.tif
8ef842f98e68c6e2215f5d8a3a66795e
ca3457324d995a284fe2506052359202d4da7c16
F20101115_AABAEQ camacho_a_Page_083.tif
752150fab9387f44a27b1b64f3a505ac
232a1c776fa0ac33c13bc9a6fd666ee68ab91d54
F20101115_AABAFG camacho_a_Page_114.tif
074f0c8cc6964140f1773d7fe0c0c4c9
297b8f061c539fca30ad6234f56a2503df36ca51
F20101115_AABAER camacho_a_Page_085.tif
92ab167943ad61990cede8ecdf7b69d7
c895d554d09aa395981df01c28d175ba4283d334
F20101115_AABAFH camacho_a_Page_115.tif
82a5b5f06217cf2bdb294da3bc3f9650
a4e5970ed4549eae65e49712e923b2d07c7361e2
F20101115_AABAES camacho_a_Page_086.tif
afe8f797ada37c6b509ebedaea07f009
db6c7afd3c11399043533cb88b08036ba53ba41e
F20101115_AABAFI camacho_a_Page_116.tif
ee4da0dcbc3bd08f50f4636dcde35a6f
ab693009baaaf47d0ad6fb8ed21d9be40c4a36d5
F20101115_AABAET camacho_a_Page_088.tif
a3d222d28f66ae29c7d2bd548db9270b
07a63e6550f906826bb154a1200b2b6baee819f1
1834 F20101115_AABAFJ camacho_a_Page_003.pro
411caf2a05d3147e92b06c09506ae93c
94d424d6d54ad0aa5e1a3df04ad3b5a0bdebcebb
F20101115_AABAEU camacho_a_Page_089.tif
dc62305d2586e1a4668c2d6a5042af8b
98c117b39a5a64a8511b9b4f483703d45073dccb
14127 F20101115_AABAFK camacho_a_Page_004.pro
6f7ae68bf06b664fdb480e6114d35002
22d5cd6be9fcefb87a9672b17828eeb0960946a4
F20101115_AABAEV camacho_a_Page_092.tif
7c91be9e9b111b32296a83d11c5db663
54e9a9e0193a601d956ff6070043edf118df01fa
70978 F20101115_AABAFL camacho_a_Page_005.pro
55a659617a7272784ab350efb1fb9383
f47f97776a837c362bdd95f044352ab9bc52fe61
F20101115_AABAEW camacho_a_Page_094.tif
076bd795fe673b07ce2f22d6133e1901
654a1f6627e535be4a32d6f99829bb69ad148692
29780 F20101115_AABAGA camacho_a_Page_028.pro
367ffe6b87b1010a03a985a8366ebe54
2ce83e8f1b91be5867494ad7f31c43590b205696
77797 F20101115_AABAFM camacho_a_Page_006.pro
9e04ba345b908e7b759972df71b3c7c0
2b5cd6081d4136109878892b8cbcc1f09efa42a9
F20101115_AABAEX camacho_a_Page_097.tif
1919aa28e6a677148387c05ec821ece7
622070384e111d588b051bdce6130ce85f5d90c1
24711 F20101115_AABAGB camacho_a_Page_029.pro
4371e9a358c96dd0148cab3d43880e1c
52a8d188a30aad52851b1535bc52a415ca60069b
27249 F20101115_AABAFN camacho_a_Page_007.pro
a59e406d9e7c21930d0f1b3bf2a64d52
193e745d5fd4fcef776f324b7fe96359ce28b9af
F20101115_AABAEY camacho_a_Page_101.tif
63836e55f3cd28a8d3645691a79e1c78
d6b9a2a8b9b725dca0d6af59c1885e038a5fdf65
16534 F20101115_AABAGC camacho_a_Page_030.pro
37ebb574192907530b631031122c3215
c8e3d50bf5a0dffed8c023803c57565207ba571d
F20101115_AABAEZ camacho_a_Page_102.tif
bfc0c5476154f9c6ef632f7a9bceb4e0
3c6030df3d93f95da14b3e95394fd2423efa9f48
19082 F20101115_AABAGD camacho_a_Page_031.pro
f5e17f14884a47dfe73d750ce0ff9fc4
0e63706f836b3d621093f6e39968cc0a10c0c421
51871 F20101115_AABAFO camacho_a_Page_008.pro
450b781e1b442fa86f48dba47946e1c0
b662715952cc9855f7504b6d2ca8be07df503e8a
52857 F20101115_AABAGE camacho_a_Page_032.pro
3f31d5b761ed0201f441707a3776f865
c93bfd1a7d508f94a5f0d66d73dc86c5c6338509
20198 F20101115_AABAFP camacho_a_Page_011.pro
6f2f7729f77c6fbb2fee9e7518e0207b
be274c91174242273cb88160d77bd461c4c5d661
27696 F20101115_AABAGF camacho_a_Page_033.pro
63e06e5b403fbdc5bc8c26439267c7de
0ff3824973f74774df82ba13f110e6f3126236c6
29492 F20101115_AABAFQ camacho_a_Page_012.pro
3efd344cbd19d2cba579c18b3c31d610
c5701eb6dfb1e70d937ceaad5fda9e90194f7dd8
26816 F20101115_AABAGG camacho_a_Page_034.pro
788f961923143e5171fc4ba964a2b29c
a972c3d46eb1f0210b6edab4db9085da71617732
51558 F20101115_AABAFR camacho_a_Page_013.pro
fb2a91243f4ac7157765d2c0aa7948a2
49c65ea77fbd3c12a70d06479c28f70df92a3a9e
23201 F20101115_AABAGH camacho_a_Page_035.pro
6e2fb0e379ab5e553f72a4bef6aefa1a
a18fe4412c04cab451d8778cdb6d5cd4a317ab50
53177 F20101115_AABAFS camacho_a_Page_015.pro
f9caf4b0dd58c5d90230a6ce37841ffa
be4b743f13c4b2bbd37e5175f692956fe05ce517
45988 F20101115_AABAGI camacho_a_Page_036.pro
b1c923518fff66b34b4e8dcd63141eba
7330505f6fdc10324264c2334551a091b68e7bd7
51167 F20101115_AABAFT camacho_a_Page_016.pro
04abd4124096cf6c06bf1230193be4e6
6fd92a89a3eca041096628ba14ced64fa1aa9751
33308 F20101115_AABAGJ camacho_a_Page_037.pro
b1ee1ae33a75b7769ef6486505b47df1
3b25178dfeac192fad1f2ee6659bf552bfd9f132
32358 F20101115_AABAFU camacho_a_Page_018.pro
07159ce30bba692328115eff57a82887
4a5d0e993d2d16b7c6fef5d7609f8d1d3218abe9
27945 F20101115_AABAGK camacho_a_Page_038.pro
8d77195851f3dca79f6d0bf0c306cae6
a906de0b922b62733981c2911aa352d351c6d2cf
54836 F20101115_AABAFV camacho_a_Page_021.pro
6eb4dddf83f0614d854a9aaecff7c23d
0cef10f7cf1813abc036268316ac9d867cceeb6a
47010 F20101115_AABAGL camacho_a_Page_041.pro
e1ba7b980dd2dc0622f542bd3559b0c0
2b4b9d2b84f827eff43bc66f7375e7a2204c5548
27273 F20101115_AABAFW camacho_a_Page_023.pro
5fe8ade1d5a52ddd5505df9cce20c2fa
38dd714b540d69a1feb969359ce5d17f45ebcfca
41228 F20101115_AABAGM camacho_a_Page_043.pro
9a1ce5981be93225a2ce9815f38f2c31
768ada56478d89292bad44048c0aafa84b5986da
30428 F20101115_AABAFX camacho_a_Page_024.pro
8b1ce011ee6c7d79844079457a84c288
78b5a81539a5df6221115f79d6d01d315c18740d
33389 F20101115_AABAHA camacho_a_Page_066.pro
7a3d04ba92a4723268a4a0c87f2427f4
ea968ff300fb241877c6582285510e014c052dae
21603 F20101115_AABAGN camacho_a_Page_044.pro
ae075ab4101c278fac85c457d47ec87a
cf1252aa2eca6f4580fbb5a044f2e92bfb2779fe
23891 F20101115_AABAFY camacho_a_Page_026.pro
747bb657e38282281d97e5e5cd54dcd0
ae3e01c966742b37f1836adcdfeacf5cf849831c
54249 F20101115_AABAHB camacho_a_Page_067.pro
4ec5d4ea766d4607a6714ba7763ae464
4ce4098a71d1ac3355c36a22fd282f8689b15229
29620 F20101115_AABAGO camacho_a_Page_045.pro
749e4288dc572bcf12e078495b62dd5f
16461f0c62947ba2f6be2f061f2f40a872adecc1
48803 F20101115_AABAFZ camacho_a_Page_027.pro
7f2dff3f20632b7df7e37e259853aaf2
533ee633ed46020e7f4e44e6d2589d70c6f668aa
31090 F20101115_AABAHC camacho_a_Page_070.pro
dd351a46b894f09f7384cdf9b381ee77
2e2be64c6818dfbc7288a9f2918ba3ea741def44
49311 F20101115_AABAHD camacho_a_Page_071.pro
c4a4bb796df4f08188199d77c0f0909d
b45bcf36d9249a303f8df04c160fd75beffa9c98
13374 F20101115_AABAGP camacho_a_Page_046.pro
26b6f132473fb94ae2e878e024b35c08
86aec8c6e6b9436f127755adc2b2f9f12a16ef16
56427 F20101115_AABAHE camacho_a_Page_072.pro
f639c3e05ff0d850d807fa9ded62953b
de5dd3c141f78adaa1ddbc159767ac2662fb5514
30984 F20101115_AABAGQ camacho_a_Page_051.pro
499f3e511a145218d9cba8c43e3be2c4
d5fc584e78f64476bace5fa28ca91c4bcb1a03fd
53938 F20101115_AABAHF camacho_a_Page_074.pro
b1c62db6fbdc70232bb1915fe4e33c53
57d3463681861707851c77aedb2c0135d655b0e4
22837 F20101115_AABAGR camacho_a_Page_052.pro
edd048ea12ee3656db15e04c5f258c5a
fb46d7fb4de6daf638871608723517f91369496e
36986 F20101115_AABAHG camacho_a_Page_075.pro
756f9434aaf0d1d4646cf5da2a5324d0
92740c165e4a5fd0ebcee69f6e5eb15e2a8cb9ee
51732 F20101115_AABAGS camacho_a_Page_053.pro
7cfbb005b6f30dbc7f14593ad4733d83
f49e3966f62d3bd73ca3de256f21318f6c940f05
34435 F20101115_AABAHH camacho_a_Page_076.pro
ac25dddad3cf15fc75bb277bced35bd9
7d59d78f5626eb5c87912a811e0944aef37ddd3f
17968 F20101115_AABAHI camacho_a_Page_077.pro
51b4284952de4d0bf7e5036dba001d69
3e42a997fc2d87927f487e9eda140a50de98fc23
54402 F20101115_AABAGT camacho_a_Page_055.pro
49485ea213586ab1c85baa3fa82463bf
e4f66a49c260c9ab7db0080e07ddb90e6f2fb6e2
26965 F20101115_AABAHJ camacho_a_Page_079.pro
e709758cbaedca51eb01ad0161189e0b
ab9667878b0f98b618faabb5a1e1aae97013aa8c
46764 F20101115_AABAGU camacho_a_Page_057.pro
43176a3c5bf82f024391494a36939c1d
e40393543dfb3d7bf99dcf039e976ee0d11ae2c5
36937 F20101115_AABAHK camacho_a_Page_080.pro
af03eb57bed6360d41fd12bb7d3e804a
dfe13ae36c188becf91c57e05028ea09214b35f1
23936 F20101115_AABAGV camacho_a_Page_058.pro
5ecc191b5c56fb61204254acf8804c1a
586edde047ea24033199f61da93f4b31fab13ddf
41846 F20101115_AABAHL camacho_a_Page_081.pro
9a2e4aa913199b1914cf276c4b58b051
2b987f7b27b5bf3d7514632165848224a1c056f6
13868 F20101115_AABAGW camacho_a_Page_061.pro
a991ecbc8e2fee508d420b43132ecd99
d557dc06aae9d3ac8e89958789523735b17e5bd3
49118 F20101115_AABAIA camacho_a_Page_103.pro
79b428bf08f931ea45f94096330a3b17
0450e0f836d173112517478b8750a06e79a188ab
53022 F20101115_AABAHM camacho_a_Page_082.pro
eaef2bb2497bd3627d4ca4564091887d
3fbc08205056cfaae5300f073d86273337b16097
42875 F20101115_AABAGX camacho_a_Page_062.pro
51dff5eb79b08e3cb6be7fa5df198b08
e4ed6bd5fcea8b91a960fb47563309fb42f7df67
48581 F20101115_AABAIB camacho_a_Page_104.pro
68148c08278181cd62be54c5595298e9
c1f974b0f20ed345585a1354994e29bf755c52e7
23913 F20101115_AABAHN camacho_a_Page_083.pro
d472b1efaf48dff526eb0a508e194e41
935b797ad6c09e8ef74f6cc23eb654d5dcb99a53
39849 F20101115_AABAGY camacho_a_Page_063.pro
9ef95ae789c837a1a8835e2378502f8e
d05b23c3ca03c18f7c8e93b8add27330078be41b
53650 F20101115_AABAIC camacho_a_Page_105.pro
d84e2cc1cdc155654c2fba015aeca819
418bf5f2752673108223a86c8a1f0ef42c7b2ae9
12476 F20101115_AABAHO camacho_a_Page_084.pro
e6de99e0659f9b4d115d44e9b93ef9bd
1cd5cb8024316eef2202c2310491f042b2d5e3bc
48467 F20101115_AABAGZ camacho_a_Page_065.pro
e2105f7f20521b3a084746d6f89c8010
822d21ec22b55e648d3db400f64664ec5b5733ea
52615 F20101115_AABAID camacho_a_Page_106.pro
c636a5f520b7392014988f24a44cfd41
c07cd99e00c8b4f3475fe5af19ea7f372ac7e2c9
34140 F20101115_AABAHP camacho_a_Page_085.pro
e2a27acbe7309ccbd1053aff41ae6348
27128844e6392edfef7b4a0aae2b724dc3e79441
58482 F20101115_AABAIE camacho_a_Page_108.pro
47ecaa9f4769be77809a417db528443e
00abe200f031b59f13eabad55b18e515e9ccb46c
53671 F20101115_AABAIF camacho_a_Page_109.pro
3a1484b0afbb0ebea709b14f2dc9853c
32ab7ed16b8036203cb6f6010639b0a5f6a75c7b
61400 F20101115_AABAHQ camacho_a_Page_087.pro
59bbe0b802201649b416516c36cea6e7
f9e79b71ce24686d73e9dd52880540dc7dbc2288
107367 F20101115_AABAIG camacho_a_Page_110.pro
397d07dc8027cf44b1ee11af2a8fa136
4112bbeacac366cb8ece24a9b0fd8b3eb7e252d7
29481 F20101115_AABAHR camacho_a_Page_090.pro
5046ef9f8de9b63e258aba788b1bc0fb
6e93abb3144927fa3932a143d2213af9d5266d2b
55835 F20101115_AABAIH camacho_a_Page_112.pro
9b466bd640c1f09cf7bcff62f1c86ea2
885e490781e9c029228969a02e54cf576022386a
34345 F20101115_AABAHS camacho_a_Page_091.pro
3d250a0e7b8ccf4fae4907a3f3d5d81d
ffc50e58a41e3543efc6eba864fc8f0a7ff81ce8
36301 F20101115_AABAII camacho_a_Page_116.pro
debb6a992b561e6092a5e9e028ebc9de
4c448405b2b960f9b9ac5cde3231078b3175a835
56432 F20101115_AABAHT camacho_a_Page_095.pro
f83315abde443e6cd7e38c7a688974a3
f952e29b41d077d07f1f66ec4b6ca04922486162
458 F20101115_AABAIJ camacho_a_Page_001.txt
b5d09b692ab5dbe68322ed040e222cd9
aa5e5b628a6416fe45e5777be0ccc8642eee6744
20294 F20101115_AABAHU camacho_a_Page_096.pro
8ee7cae4c3c7894a2fe5a7b2191a491a
cd96f1e70fd34d7135db938810b10bc4325f2a78
3037 F20101115_AABAIK camacho_a_Page_005.txt
6c03134958b7814d6df176ff54ef55b4
f7b15578f064cf0327ea82cbdba7b799555d14c3
48724 F20101115_AABAHV camacho_a_Page_097.pro
054c28cd508fcdfdbaa8f0522af56058
2dd63666cd3348c6451593ea235265f629521a4a
1159 F20101115_AABAIL camacho_a_Page_007.txt
1738ca408d2b7be722d296a1a4cd70d3
65dfa7d4e077c3d8318f04d2efb604c98a42b5d2
37458 F20101115_AABAHW camacho_a_Page_098.pro
63338ffe118d05b2878eed039a31b8b1
07c48c0002ff4b833745e260cf6b5c9e7baa3867
1149 F20101115_AABAIM camacho_a_Page_010.txt
8d4c75c704afb79af50162b58eb88e90
5ba10a0961a6a06da81ee05270ba61e0fc25beea
64729 F20101115_AABAHX camacho_a_Page_099.pro
e6d1f6044db29837341a2a612f314456
48b3f1059d949c0108270eed977a0701f3fe07f1
1353 F20101115_AABAJA camacho_a_Page_034.txt
b6c7d84d2b6981f742ae8bbc3397108a
55fdc54b5b03cd51ce294ba0e9f9d0cdf177085b
2104 F20101115_AABAIN camacho_a_Page_013.txt
fef65767468453006abd29ee20220d48
aac33c97ae59b6257a100981a957e07f1a8666c3
16134 F20101115_AABAHY camacho_a_Page_101.pro
9f7becaf8fb8dc0d7cab0c760b89546c
5886e3979980e2affdb0777fe12d8c0b9396c5b5
1446 F20101115_AABAJB camacho_a_Page_038.txt
de149cecb9a737f5ecbea0994f2dcd0a
df4c5508892d96438bca6281e08b087cef566897
2125 F20101115_AABAIO camacho_a_Page_015.txt
0d2620e11ac5a1b47531bf3a1e867bab
4a5e1a4dc62b394abeb347289a340cc785744caa
45308 F20101115_AABAHZ camacho_a_Page_102.pro
6f7203b0fea77ae31f8e5ca83a5e0648
7310d48ac5fabca4f14a2636b796ab70fff49b84
1819 F20101115_AABAJC camacho_a_Page_039.txt
b3a4c7e8a30ba837f67c60964c9c4497
77c2712631e4199a573214b5acf68fabbf2cf5af
2016 F20101115_AABAIP camacho_a_Page_016.txt
9f562c24aad4c8299e62439a705cb40e
3c1a236123e61906a096a097b940cb7b28e18efd
1443 F20101115_AABAJD camacho_a_Page_040.txt
6417cdab0ecc87562d72a66e78ecdd74
ba2bd959363a0bb543298465cfa05890e240fac6
2023 F20101115_AABAIQ camacho_a_Page_017.txt
aa105e1d96c25c65f31f6a8f3ec240de
17978aebeb392d0dde086dca433797cce14bdacd
577 F20101115_AABAJE camacho_a_Page_046.txt
5bac8c6e3985e99d8f39889e2652ddb5
9491c94c548b0c7bd94a8b7e51cb9e870a3264ca
1664 F20101115_AABAJF camacho_a_Page_047.txt
de190114b0b36c93b3eb283a8643b9eb
a8deecc7abf71e5e7eb6de84271183f53187ecfc
1505 F20101115_AABAIR camacho_a_Page_018.txt
1c7bca74e0c7470786fe2277d9c15a33
cc25df96ba63befd80eda1e0c53ea18044d2165e
1361 F20101115_AABAJG camacho_a_Page_051.txt
81e539e94ffc4765ee6e2a133c126d98
5189f0295de7d657e60abd3c14932d7fb97dde2c
2160 F20101115_AABAIS camacho_a_Page_021.txt
0b04056c5c18833359cc0f59fa3e5946
c1cc8de7a7182131f3a9f97200dfdb9e42dac8e6
992 F20101115_AABAJH camacho_a_Page_052.txt
e9c25a156114c2adf983add5c594fd34
b35db6d06294a8ef0cc0586be1acf4cfe00c5a8f
1318 F20101115_AABAIT camacho_a_Page_022.txt
2e747f987a4817bf5f407182c0a015b5
0beb0ca2f4a6673a4325833ce2a81f725b1e2484
2059 F20101115_AABAJI camacho_a_Page_053.txt
cfa247f16faecd77f34126e0a767f657
77b3ba5da75e39194f96524760e16edb2552860e
1398 F20101115_AABAIU camacho_a_Page_024.txt
192c0536757074c692c57d3841f35103
1f27dbf3c83450a3feba8f2a19190679ab7b094d
1908 F20101115_AABAJJ camacho_a_Page_054.txt
387701c31f93d733b434dc19ca94fc36
45536ed6c39463e4e76765b3f1b11913b711ab73
1507 F20101115_AABAIV camacho_a_Page_025.txt
fd1194a5143ee7a57afe56e0d23f0c08
f2e3e3c6858658cc23455fd116f2343e1eb229e4
2202 F20101115_AABAJK camacho_a_Page_055.txt
e1fa9a7d1ad312fbc7e63e143f4ede94
f0e51fc74f8daef2b6ffb7f7f711e780f572942f
1415 F20101115_AABAIW camacho_a_Page_028.txt
0c44f6677e2b2da01ff2f534c7606c02
a999c6320871c06d70d562ad18b0e231530f219b
2027 F20101115_AABAJL camacho_a_Page_056.txt
25282141640f72fc4109dec3589f3ca1
75101f11da23e466c9c20a6231beff161e524fbd
1132 F20101115_AABAIX camacho_a_Page_029.txt
cd6db5283b9cd0ae178ef4a146112cfb
917e0f9a60056cae9ca4f80a3951952d5071443c
844 F20101115_AABAKA camacho_a_Page_077.txt
d0cda03c9215d62d994d02b8985f2d3c
b5324af6bcea4cf0c20113be1905b540b268ad7b
1195 F20101115_AABAJM camacho_a_Page_058.txt
cc1a2e283b04b28b33d0d3df2967890a
a58038f25aa77a7f5f8447a79cbb25567f9845c0
703 F20101115_AABAIY camacho_a_Page_030.txt
613245038004bd7dd1efdd49b4e200ef
dcf212444a923693ce2130fed79ae0e5357f8034
1595 F20101115_AABAKB camacho_a_Page_078.txt
c081f9ff162da3e36fc7c3fd7da7275d
034b8f3615292f35feab3de6e7b8b0c9e15268cf
1154 F20101115_AABAJN camacho_a_Page_060.txt
265736fefa188c6f07474bbd909cf7a4
801e4e4e161286a37cf9fa07867d7ed5c4db3cf7
1081 F20101115_AABAIZ camacho_a_Page_031.txt
dc8234d5d9eae3561b0467bd1f189ca3
cf6643c583ddb56ea6967cc25dec94d7c4619ab4
1197 F20101115_AABAKC camacho_a_Page_079.txt
a98a5889e30a5c177966dfe22eee465f
79eb9e0efcd51d6ddf39ea98941ce54a7f6d2b0d
643 F20101115_AABAJO camacho_a_Page_061.txt
f54e1d3115169bf84f345af1c18ce649
be3562c5148f9e01df2df7958e15b196eb1bb1a2
1706 F20101115_AABAKD camacho_a_Page_081.txt
88bb788619d3e13a880bb0fcf8e2c4ce
abb51ac1c1955dc58b8d3905cdd12622e3be805e
1756 F20101115_AABAJP camacho_a_Page_062.txt
900f6ae32454a931f23dc2373b1375e8
9f2e3e0883a16a7b61ae54953d7a028041e59746
2088 F20101115_AABAKE camacho_a_Page_082.txt
aa724d71d906f4d56a81f4f37ca9c6a2
24e97f7b87b0e29d66d0a28002f14d6117c8102e
1762 F20101115_AABAJQ camacho_a_Page_063.txt
c1ab4ddf9e7c02a87f893a72dafdb099
205d2064ea4514ef95e8628fd79f20c60b9149e7
1213 F20101115_AABAKF camacho_a_Page_083.txt
4f7fd3fcfb075df84f4350c013ef51b9
4a96ce67b5fb6436c5e2af9ad49b0f15e4e0a248
1597 F20101115_AABAJR camacho_a_Page_064.txt
05245995403dfade6b9f17334c2b50e5
3b89e43716bf608c7d993c50ae07533530037a85
43350 F20101115_AAAZLD camacho_a_Page_083.jpg
63f01c52b7d1d105753b536b9664625e
6fa14a44c98cc3bdd4e56086d0c760af7799ec2c
698 F20101115_AABAKG camacho_a_Page_084.txt
d4dc9ec3c0e299c21edaa117e849e832
24ee2e372eb3fabcf8834dd6ca6fdbe7ea1f6337
53464 F20101115_AAAZLE camacho_a_Page_028.jpg
75e8a75a48cb94729c5f7b61b4f7b672
8b03fac1842438ba6604cf49c02891cc7f873c70
1556 F20101115_AABAKH camacho_a_Page_085.txt
d97d4cc0dc4c9c38f8119f669f229cb6
dad70c780a0b76827c38b7e29c8848121afb0008
1961 F20101115_AABAJS camacho_a_Page_065.txt
4a861f5afc54b3904a61efa9da01e692
b5cf8d948ef5d286fee165c73467fbf26d6f0389
30991 F20101115_AAAZLF camacho_a_Page_030.jpg
9e65f37b00f1a8e0f1783c563489ef3b
0dfca289902dea90af0c30ed29f33c9ee061fadc
1201 F20101115_AABAKI camacho_a_Page_086.txt
43c37af6168df3c201d69f5f1a64f36b
4e1a5ae133fb26cdbb08eb22b6de8189e42f9844
1487 F20101115_AABAJT camacho_a_Page_066.txt
4b22eebab46d9c5d7a958c08a21afda4
b1a31128a82e5db259e25d70cd33235547aaa99c
F20101115_AAAZLG camacho_a_Page_026.tif
7ce35e64d82e3d1875bd93a6a0234ad9
f684b0d851741534c6604569910fca7bfaf9284c
2597 F20101115_AABAKJ camacho_a_Page_087.txt
888266ccf7d685105a1785409a59878d
4475ce1d0c9d581066d4bb736d0675bb651aeac3
1999 F20101115_AABAJU camacho_a_Page_068.txt
b43995af1beb0b254ffc4df7b4adaa3b
26b6d1aaeb27ba752e3b3cc0875a6caadd938f3a
55830 F20101115_AAAZLH camacho_a_Page_070.jpg
70cc894dbf16497b4d5bbcf67b48989b
7bbb3ac46773c59c6c59ae1e1a8b867847dbf066
2950 F20101115_AABAKK camacho_a_Page_088.txt
9a77d24f09dcf2d493a7d09657b6c411
a758ea73a449492df4fcb52dc96ddebe726443aa
1540 F20101115_AABAJV camacho_a_Page_070.txt
80a3599db0f077ca1607c23bb5416082
4cb41bdd512c0f89be7c8805e06d6fcdebecf579
1651 F20101115_AAAZLI camacho_a_Page_009.txt
3cc09ee7f86b93c2b8e32252c5933ef5
aa783179cb05f4c612d18f2a8e3a798403a4f862
2316 F20101115_AABAKL camacho_a_Page_092.txt
483c44b45a26890be7e44c48929920af
4f43284a4a0c7e84575f2724ccbc6c0d9f0c95c7
1978 F20101115_AABAJW camacho_a_Page_071.txt
e1e16422309c2025d2fe012ebf980843
006092f22d6780526cec951159ad8db8f39896ff
F20101115_AAAZLJ camacho_a_Page_112.tif
01d840b1d91fabc67d1fff021895b310
97b3e7fcfdd2be7fafffcb30022eaa931cab67b5
17370 F20101115_AABALA camacho_a_Page_079.QC.jpg
516c18fdaf85d9294c01c13a1c5608bd
376968f275b1536c77bd635ac8cadd2e7ef21443
2144 F20101115_AABAKM camacho_a_Page_093.txt
6857727dc63bd07f5685907579429a4e
3880513eabead861e43fae8fc0ff2dca618cc876
1188 F20101115_AABAJX camacho_a_Page_073.txt
c28fc83e875338c97c28fe5b4f97b144
72fa08f2cfa462b7c50b6101413867ae8726211f
31183 F20101115_AAAZLK camacho_a_Page_020.pro
0b0f2536672487ee153d87d4dd8cd68a
fe10d1088ac9e46be5e3c6018f168b1c55d10fec
21416 F20101115_AABALB camacho_a_Page_092.QC.jpg
60a98abc1e758f6b95bc6bfa5cc45b28
b3897a1f61c8dafe10cf08bcb687eec00cb4310b
1752 F20101115_AABAKN camacho_a_Page_094.txt
c74643239c279c0ff06f1b2c8732ecd3
e5d4b23a89e33606bf1d5dc6676636da3c8fffae
1683 F20101115_AABAJY camacho_a_Page_075.txt
b81b422a8854cec72695f034bd3397ff
c0f60699a37e2470290cc16d834c26c7f465889e
56634 F20101115_AAAZLL camacho_a_Page_020.jpg
cd4dbc389624dc71f3750a32b84a9b17
5c9a83d62dcb9d3bd408ecd20ba36409614fa5f0
14381 F20101115_AABALC camacho_a_Page_035.QC.jpg
621dd7a7078c71a09e8fcf9760515488
2a3b3348f417707062fc8f18904f243dd0cc40c2
2299 F20101115_AABAKO camacho_a_Page_095.txt
0ecc2bdecb9b683ca934bc2e298b6165
e62d4a4eae251ef501089b9927e3c3a57e9ddb92
1685 F20101115_AABAJZ camacho_a_Page_076.txt
f78a88621b33789e8b91ee260724e4a8
415eed57af0c99a63ab49fc34d98a7f92b792f89
95929 F20101115_AAAZLM camacho_a_Page_062.jp2
284639c6f01c771c8bb7cf5534d0e52e
c7b64b590fdbe83124fb4bb1802c60b56b4c3a11
6568 F20101115_AABALD camacho_a_Page_097thm.jpg
ebe30cf3b653f851913b1aa865629ecd
c9960b88e4ccffa1b6e18e410092e1070eb77be2
814 F20101115_AABAKP camacho_a_Page_096.txt
b0845fa8ea7e939c6552ec1a6c842540
2b47c81183a12b41c76c0e6f05b4007fbfa6f38c
21852 F20101115_AAAZMA camacho_a_Page_094.QC.jpg
5b92b266661d83b4bfdf764c0ff87db8
26805e746ce6d1a568296fcf7be1064bb1bf92be
18385 F20101115_AAAZLN camacho_a_Page_064.QC.jpg
b6017543a3ccf92004131b98568d204e
12ac8bf845602d78cb5d7f0659f7ba656b58cd32
5805 F20101115_AABALE camacho_a_Page_063thm.jpg
36ae35a2e78e8505708ba8523ef1989f
135c140d12b48d5795c4206c71e4c84e87d88de4
697 F20101115_AABAKQ camacho_a_Page_101.txt
cbcdb313bfa5cac789bd13bcdc290efb
43ceada6ae27be6bf0257383f60c8e22868fa578
75429 F20101115_AAAZMB camacho_a_Page_089.jpg
277b535d04386faeaf48d99313d3338b
66612240da5671869f10d30a3e0a245236feaa66
57030 F20101115_AAAZLO camacho_a_Page_080.jpg
fcb7c7f18cf54c2b3bd60a8d9de6b5b5
1a13e883712dd5fc4be41f1d7cf7d2f8dbaa8f74
4071 F20101115_AABALF camacho_a_Page_035thm.jpg
b20bc4d7a53af36034aa1047cba349d6
4445b79e281c27e0ff54106fbacf2998714786d0
1942 F20101115_AABAKR camacho_a_Page_103.txt
32c015b8446f1d2dffafedf09c41dd3a
20fd191ee29580cf001edfcf39012a089f798aa4
F20101115_AAAZMC camacho_a_Page_005.tif
e322d535461650a9d1d98f2fb9f052b1
474e9f53d30557431edaba7e0daf16efa97cfa8f
2137 F20101115_AAAZLP camacho_a_Page_067.txt
5d117bc444ae12484b0f419fb150d90a
f7d2bc10889ac7bfc8fae3c821930236232a28c4
6544 F20101115_AABALG camacho_a_Page_111thm.jpg
299751693adad7041cbb93b361ad2e89
6ff83e669e7db60839d860c47f67ab5634376b5d
2148 F20101115_AABAKS camacho_a_Page_105.txt
869f6826837481e3d2de198b70c9f8dd
9c891c3d9ccfb8252004e6e3e43de36ee383b211
48044 F20101115_AAAZMD camacho_a_Page_029.jpg
1329bebb3e8ff988899bff40e2e6be67
d1d19728ebacd385aa73bf0c4a07ccb47a0a9ff1
12167 F20101115_AAAZLQ camacho_a_Page_084.QC.jpg
b53cfadbd1154be769c5a2251bbd551a
3226c27abf8a7bc87082d8eabebe5174ec5afa91
5569 F20101115_AABALH camacho_a_Page_005thm.jpg
3d3f2415530d2c816bd42159171fb768
b1be082eda9f15187d36a0c63a648e48c8c61471
F20101115_AAAZME camacho_a_Page_017.tif
99c001f6d92bf2a44780d3164b57d85c
222f0ee28be0ee3a3350a084a19f6498a758b492
17796 F20101115_AAAZLR camacho_a_Page_059.QC.jpg
454bb1853b9ef62493cac807bc64ce83
887d4b2180d91c0da59a85a17c9a67ee77bce799
5445 F20101115_AABALI camacho_a_Page_050thm.jpg
60ca58fd6387b6c01c28b8590bf30283
a06dbfcc9a16ac14d359e8d849e89fd4ad99973b
4154 F20101115_AABAKT camacho_a_Page_110.txt
a8e1a1aa9f7ad3092ddc5e1ac03ffdcb
49ad38bd1b967ac1c15865c35127f471a3009df2
53626 F20101115_AAAZMF camacho_a_Page_019.pro
ddb1d5253963f116db3c4f016b4d6630
c490865b0196ee3ce80212cf37dbbd6b78568acc
F20101115_AAAZLS camacho_a_Page_091.tif
f71df8aa5ec9b7007400fa642cb19e2a
25593d1f6d4ff5295a2c7a033f7c4836cadc56ec
14714 F20101115_AABALJ camacho_a_Page_058.QC.jpg
8394e18b5fc83b5c0a543ad6a3409808
0be11c33cacc34154404e3336b41c310fd782bdf
4000 F20101115_AABAKU camacho_a_Page_111.txt
3370c44ef5e320c9dab41df26591d53d
7b4fc448549586284c104dbfe8f766fc7dc6c122
49812 F20101115_AAAZMG camacho_a_Page_068.pro
8354eb8acc8d2baec2169a65fb2ee23f
1b31b68830d9e65de9fd30fea5e29d385c554acf
856795 F20101115_AAAZLT camacho_a_Page_042.jp2
9e3745562bb8a3bc428aa2d4e222bf94
9b435c47cd538b943bd3a8dade05fddf41907851
24230 F20101115_AABALK camacho_a_Page_017.QC.jpg
80625e9011264fb28439f2cf50bcc80c
268509a92da14ccbbbb9caff2c86682d3c6cbfd9
2559 F20101115_AABAKV camacho_a_Page_113.txt
7ad96fb80e9d236bfadf784c7015c6a4
ceedb3b627f4753c6dc15645bdf9d027a7f3928d
82978 F20101115_AAAZMH camacho_a_Page_087.jpg
6b321dad962ac3fc55f0adb7b64fddea
fe07313e0bac5c086c677db260b7e280afe6dd86
17203 F20101115_AAAZLU camacho_a_Page_100.QC.jpg
1608d34c920b79bd6a58086c9ceab997
d36e7e351b26cf7b7722d42149074be47d335d24
22865 F20101115_AABALL camacho_a_Page_097.QC.jpg
55f9d1ffb03a5d418cd0b8260c613213
a7629403894a016a941beff41c790983d13bcc97
601 F20101115_AABAKW camacho_a_Page_115.txt
5be3ce9db2e8b88bd0aabd9f33405f5b
ec78e48a08707d9a4d6cfb813873270ee4f44e50
F20101115_AAAZMI camacho_a_Page_095.tif
c962d29f0d1073760bfb627e5e806e72
01661cb093326e0f02059776d36bff3fa3b22c9c
61223 F20101115_AAAZLV camacho_a_Page_085.jpg
8e3d50effe5e9756d52ada585a639fe5
14bb33eb4b2dd14c163efbf2214bb364407ef595
22452 F20101115_AABALM camacho_a_Page_014.QC.jpg
4a827e42c4bdc38eff1bfcba91a438df
6319484522b6d505a6d5a595247125507e65a01d
F20101115_AABAKX camacho_a_Page_116.txt
2893dcea75c1178a657761f79d16d8bd
1ace745292695f9e39a0f0038ac184db693b2074
15985 F20101115_AAAZMJ camacho_a_Page_033.QC.jpg
c3cb42038aae3c0fff63c02a4a8aa5b7
2a3bf6ebe23228bb86922f8eec9cd9a141e14da0
6171 F20101115_AABAMA camacho_a_Page_094thm.jpg
5e53a97718c419f82b9733ee5e78cd53
de3a8a9dcd35c21566c854164aec4597d9954ebf
37412 F20101115_AAAZLW camacho_a_Page_093.pro
c3f4797c2181d0c49a95b95cfff6350f
278a5ec4e8c46f0185ae9deaa1f52905edd15783
19245 F20101115_AABALN camacho_a_Page_018.QC.jpg
25d155ddc8c52b8649f573c40735570f
90f62aeef2a7ececbc1a4c927a8f85a2e26502b4
2235 F20101115_AABAKY camacho_a_Page_001thm.jpg
d481423a3a4ad10a189f33acaebef10f
c9a06c1c2e358cf24d55f5f41d3b6f44e46749cf
16605 F20101115_AAAZMK camacho_a_Page_034.QC.jpg
f72af367367c06f45ce51b79d8c6185a
5c8428e6587c76b74a8bd3db2b4461fd0114352d
6585 F20101115_AABAMB camacho_a_Page_102thm.jpg
62fdf9e4fa5379a4361992d061b48afd
4c3071feb3a3c0dce8fdc3f08b746b5c27afa29c
61669 F20101115_AAAZLX camacho_a_Page_040.jpg
309a74dce9a0e96a5966b29bf89ca153
1803daf06bc7b66b3d45f682b896b9364d6a66d1
5300 F20101115_AABALO camacho_a_Page_051thm.jpg
4689d02cef665b1b141b2704e3d95ec4
8a8b08d8f4b0f45ae215c5254070332915cd2e11
968146 F20101115_AABAKZ camacho_a.pdf
e9d4715f87c0e3a4d300ee80cc5a603a
2a1a8210adfbc3294ec0061c1410b90868348142
Object1-1.Sawtooth_waveform.wav
Object1-2.Pure_tone.wav
Object1-3.Missing_fundamental.wav
Object1-4.Square_wave.wav
Object1-5.Pulse_train.wav
Object1-6.Alternating_pulse_train.wav
Object1-7.Inharmonic_signal.wav
Object2-1.Bandpass_filtered_u.wav
Object2-2.Strong_2nd_harmonic.wav
Object3-1.Close_tones.wav
Object3-2.Four_cycles_100Hz_sawtooth_waveform.wav
1467 F20101115_AAAZML camacho_a_Page_033.txt
5a0a7469c5b97e72571a77edbc442fcc
8782612e2ccc0947110d8a6f92d1ae0c0267787c
23733 F20101115_AABAMC camacho_a_Page_027.QC.jpg
97d33325b2226fb3b93aadd63f4f4c32
74ea4492d59d998f592dee5c353bdf9fc02e20f1
3445 F20101115_AAAZLY camacho_a_Page_003.QC.jpg
0f65008ba280f21956e77c842320afeb
7acc77a27a27c97291e3e63af8b00ff5727cd534
4573 F20101115_AABALP camacho_a_Page_044thm.jpg
4330c0ff7105394e2b6697fb2cf791e2
d1b6fc4d4f4f0e117bc4b38b2a5d721cfe41428a
633188 F20101115_AAAZNA camacho_a_Page_026.jp2
f987b3ca330f9b40b320e941e9c8e6ff
da37df07576a1646b407fd9c7f92a8b3d14e608b
2307 F20101115_AAAZMM camacho_a_Page_108.txt
53cc8e56ec9a52cbff67bfb13533ed7d
abc99c066763275e81d0bebed2e150b24ddc3845
3078 F20101115_AABAMD camacho_a_Page_002.QC.jpg
7cf9c900357248b85837179bc58d74ac
41aa331b1a06980f3c2ffb66256b47827a13ba3f
7228 F20101115_AAAZLZ camacho_a_Page_107thm.jpg
479605578fea2051ef5dc1d9114fdc7a
7e422224c39b99f54ed6b47082d6460f6ccd5992
25560 F20101115_AABALQ camacho_a_Page_072.QC.jpg
4e54b26852ae4a41c4ccc5f8e1d085bc
d66443e924e212c4ce78601fcca684bc928161c1
F20101115_AAAZNB camacho_a_Page_027.tif
e01441ecc3bf64a96e63a0245b5c4cbb
64848ca1f1663b0fc1ece7bdd99eff6a892b20d8
51379 F20101115_AAAZMN camacho_a_Page_056.pro
e1bf9872fa92b785ac729a4f5a81dbd3
112d25007e8dd139f975f3ee1bc20b56a9f277d5
24686 F20101115_AABAME camacho_a_Page_032.QC.jpg
6799da91ce82ef864554a6e2c61e0ab7
051b6dea832fcf2013d3e312f8e76a403525db93
1903 F20101115_AAAZNC camacho_a_Page_102.txt
43c1ff3ff45e0693129aca96705a29cf
c2baf185ad46fe9db32b66b5f1645cc35db3d37a
1425 F20101115_AAAZMO camacho_a_Page_050.txt
43d6a55e3dd4a9e639bbb8dca012b1c1
642ccdec375fd4b354916579b6790c17ba4f8357
1324 F20101115_AABAMF camacho_a_Page_002thm.jpg
2c7da75cc3331485dd49803d36cfd2d1
81d0a8e7a3463b723377b8d5ae672eed70d90a05
19298 F20101115_AABALR camacho_a_Page_098.QC.jpg
bdba37b83a29f12f18c5800b2a3a7ad2
129f93387c6bc07000f305400e6e9a14ce4d382f
32284 F20101115_AAAZND camacho_a_Page_078.pro
402e15c1d0125d2a6465712d3896e74d
5d426a00335ff664ed8808fc77aa9f6b4e8bdf96
4671 F20101115_AAAZMP camacho_a_Page_029thm.jpg
e9fe59d49d3a17d152944d4091308851
0843eb99191ee2a4870119a36f17f3f76d8ad018
19775 F20101115_AABAMG camacho_a_Page_047.QC.jpg
710d0788645a514e8f588d8e0431a870
d495f298cf4e6e80c95fa826a82f8b387bedd0fa
8688 F20101115_AABALS camacho_a_Page_046.QC.jpg
06f4acb8396df59c7f641222b8f1946d
c00a3537c8c3bc29b311e1ad650b64994acec583
26438 F20101115_AAAZNE camacho_a_Page_055.QC.jpg
ec804c75ce51786b3d7543862867ea16
cf309a3bfe88d42c8656ae204b8ad68b4f18574d
19530 F20101115_AAAZMQ camacho_a_Page_080.QC.jpg
915f6ef164f5c53f42e96861bbc9fb23
e259f2b6936d46654c1c322148c4d7ce93eecbd2
3832 F20101115_AABAMH camacho_a_Page_011thm.jpg
d4b07aa0a3f7b94442dafe92098df317
06bdfcc262ef5859074206aac69f5b705e6fe32d
6828 F20101115_AABALT camacho_a_Page_074thm.jpg
dd32b52a2679447560c4a3f23f39e0ec
268a3c9b8f687c379db760311f6583c0356458cb
831 F20101115_AAAZNF camacho_a_Page_002.pro
084709ac5ae20684260e8fc695332fa3
1465d40b49af7fe5e59ce89b22c75a95c1915c2a
83090 F20101115_AAAZMR camacho_a_Page_100.jp2
66283cd86e0feed18f6e11658a35e0c2
7e92ae1173622a499795ef08d67acce17923bcfd
24769 F20101115_AABAMI camacho_a_Page_056.QC.jpg
e330aec2a8b4d66284086c7dda71fe8b
ec3a40f319af71b0f43b920263da51505e55bdcb
84947 F20101115_AAAZNG camacho_a_Page_098.jp2
6b1d378cf25ae876420400b5aaaee43e
de5f91ef7ccbf9dcf1af5a167c203cd23fefdca4
F20101115_AAAZMS camacho_a_Page_069.tif
f2f4f11bd94262bd5682312ed33c75ab
4f1a6478c931bd8f9658b56d9d95bf62974f06e3
20419 F20101115_AABAMJ camacho_a_Page_093.QC.jpg
6e89d3102f09a8cb00673f19431f2258
b4ecaa9b165f4cb3a4a3c333bd4bf426c3f00211
28414 F20101115_AABALU camacho_a_Page_088.QC.jpg
e0c8f20421ebe58259b7257f8dc4bd50
c07a06518e0d1d3d3980571100d1f97c7d8ef8cb
24198 F20101115_AAAZNH camacho_a_Page_089.QC.jpg
d8772d039cff5f069524286fb2742185
def15ed0825d848278238b5713c0e3a80fc38e84
F20101115_AAAZMT camacho_a_Page_053.tif
51618f1d26693cbc1a64309648af772d
5e7607f23a6bd516d365306d60cde82685e5605b
5251 F20101115_AABAMK camacho_a_Page_064thm.jpg
6a3bd120d31cace82536a656b9022b83
53b4fea7869135e6faedfa470b83d40c22750f98
22521 F20101115_AABALV camacho_a_Page_066.QC.jpg
c46cbf2c4676d4e2735b592e538b7dce
0a7039f5e6a792f8b0948f2eef53b18489ced2a3
F20101115_AAAZNI camacho_a_Page_018.tif
f89779c7691b41d0a0a8e35a9c8d272b
d805ed378595b29f5d8bbafe3b68d77cccaa9248
57769 F20101115_AAAZMU camacho_a_Page_045.jpg
24e860df17a3e795d150ded9f70d65bf
ce42eb0d90cb53749cfe570715199fb87dc51497
20458 F20101115_AABAML camacho_a_Page_063.QC.jpg
0a6865c4f24b1acba3274f7cf910b8a4
9f99fbf3c3ce172dc629715750091aaf6c86138d
7131 F20101115_AABALW camacho_a_Page_109thm.jpg
76dc9b03cb86f2ca8383d415d7b66361
9bb94c3870a60f317840d952d83fdcb7d9326a9e
F20101115_AAAZNJ camacho_a_Page_042.tif
cbd698dad6ed88ff3ddc85b34f6ecb6d
1317c4f7edfe7a3721eda05a2cae94221aebf0e9
6613 F20101115_AABANA camacho_a_Page_065thm.jpg
3eeedf5a9420e8aaee9cf9c4a275f6ab
2e27af594ef63f41b77a32292bc35d9fdfbcf665
130 F20101115_AAAZMV camacho_a_Page_003.txt
8f33de3b56bbbce367efd9e1bdd97b17
5c6859d7c826b6bcae73792b66ab0ec288e30f95
6909 F20101115_AABAMM camacho_a_Page_071thm.jpg
f8b161d7fde973ecad1d5d37f718d292
f6a72988c73a25460a5340ec766a8b13b4e7b359
25718 F20101115_AABALX camacho_a_Page_107.QC.jpg
13f5948d7266e9496f310cba8fcc82c3
99231620bc47183a40145a2eab0306db300a49bc
5120 F20101115_AAAZNK camacho_a_Page_090thm.jpg
d72a13f051dbf55b86d037de97676aed
2b78eaa7b4482e0fa3c6690018019032b813dca8
6134 F20101115_AABANB camacho_a_Page_092thm.jpg
ba89ba927fb0ab7149dc3bffcc107d4d
66ccf2ce027172b14ec882de3d4dc105c6bc25e8
78928 F20101115_AAAZMW camacho_a_Page_107.jpg
084c0ca333fb1a55db08dcb468a4c33c
c5e7f578c1f7005001bedf1d39a9800f7b516cbe
18335 F20101115_AABAMN camacho_a_Page_050.QC.jpg
9559fc58da4160301881b087a70f79a4
989383c94f5a999ba69486784f3e1fcad2a2a779
4512 F20101115_AABALY camacho_a_Page_083thm.jpg
f8f0be4f0a8f33324c8fc08fbefb353b
2b279134d45b82b43f466c25c84f25c6aae140d5
F20101115_AAAZNL camacho_a_Page_084.tif
f9b28a65b4e104250a7a4f52f214952a
27737e8c3a1b6b5592abc98799bb5ac739366712
7233 F20101115_AABANC camacho_a_Page_108thm.jpg
303289785ed81c15fc57a5a658109d41
7031d5bc5f3e4dac32a820baf84230aaf5aa7e6c
49411 F20101115_AAAZMX camacho_a_Page_100.pro
ddcfce009bb0d4c2d33549a12a7683a5
8db7fb77e309cfb289ccf9e7bf595e31c96e462c
32044 F20101115_AABAMO object1-5.pulse_train.wav
d3dd6b3d7ece985ae504e33260666a9a
6097fb34044d5b1208ca58500c9c3b9b8c9ed8e3
7523 F20101115_AABALZ camacho_a_Page_101.QC.jpg
adcb20e52df31dc99b81d627b75144e9
a7baece791dd70b83c846d8af46621b97409aaa2
29149 F20101115_AAAZNM camacho_a_Page_042.pro
5dd55d984197ac56d01ebb13a143a592
4d30cd04f461c938d710d7b13a244170d63ed203
6644 F20101115_AABAND camacho_a_Page_106thm.jpg
ad3576fdb093e36282d9a1fa7d869df8
c1809dfcce1b5255eda4983bf2752067266901c1
13727 F20101115_AAAZMY camacho_a_Page_060.QC.jpg
3952387af07b74e23a6ba23ce165ff67
60ca4c1ef306d0465123ef54b8d153638580a24f
6797 F20101115_AABAMP camacho_a_Page_095thm.jpg
832513e4dad65a7893a5f25a5e6bcde3
8e6dce7693cd68169921507fd7875f1bb07bc2ef
F20101115_AAAZOA camacho_a_Page_096.tif
0dc30fc707d9dba122397aa57a0efc89
bdd433eb5830485ec437f54fa086ec7b3d24b2c8
7025 F20101115_AAAZNN camacho_a_Page_013thm.jpg
7fddcadf00d019dba0a273682a20b25f
02f28f45613f029f3786c24f94d9008601b8487e
17936 F20101115_AABANE camacho_a_Page_070.QC.jpg
384184bcb313a94433ee94fdb0d75194
68e843b19e86c411aa87fe18bc649924aa26d0e3
2067 F20101115_AAAZMZ camacho_a_Page_106.txt
629bde17f2d961d532b3ffcdb2128ed1
c6c5c08b0beabe6dbea4613527e598c56b69d13e
24053 F20101115_AABAMQ camacho_a_Page_112.QC.jpg
73434877fd23425e626bd72ba7310d4d
e2ebfb54f6ccc54baf46fa9895daa6de302f25d6
F20101115_AAAZOB camacho_a_Page_019.tif
4423952e8158340f6d6667286e39d4bc
08b4718380e0da107cf29afb6110eed1f37790f3
1373 F20101115_AAAZNO camacho_a_Page_048.txt
57b8054617476149bc44d46920ea7842
57f9133f0d10435c8bb5250a3a08482177b669bc
14522 F20101115_AABANF camacho_a_Page_031.QC.jpg
e25086b73f6848d96893e7d2a3f610aa
cf7b260d4cde485a0d229dba15fa634bf4c1d3ea
3432 F20101115_AABAMR camacho_a_Page_010thm.jpg
9c3a558059761cd2796c396e1faa6dd3
be61b07038adb5d9acdd24e604a7f61a6e7aebd0
F20101115_AAAZOC camacho_a_Page_090.tif
f6685c3d01588f1a8c61b25bd33c9ae5
a2ff6cf9478e30e1f0720c85007f1821538110ce
133352 F20101115_AAAZNP camacho_a_Page_113.jp2
e2f1cf71cbb6a529ebbc455c82e5ab49
198983d17062026027de34444f9284cf82fdffc2
6403 F20101115_AABANG camacho_a_Page_014thm.jpg
6adf2998c9f6464c91126286c73af79e
04bd4f245fb09fd3c2b6b4ff242b786667775e13
25899 F20101115_AABAMS camacho_a_Page_109.QC.jpg
0fde32698f93659a48a16cb8b1fc83ad
d50788ec3e1fe3f3e126400209e6707d3a73b5c7
1170 F20101115_AAAZOD camacho_a_Page_044.txt
9416ae0d733bb578c7f7b4d93113fb5f
da36e8cee11b25084db45daae97eb5ff5a901656
39339 F20101115_AAAZNQ camacho_a_Page_009.pro
67ea49562b9fcd263844140d9b59b293
2c85aebe5106e142f23c59b5b69525f08b7925c0
4787 F20101115_AABANH camacho_a_Page_012thm.jpg
d8ed28d61ba06e0f3e8d49282199e4ba
b99f69e5e8b358744eaf633c42055d4dc36af160
17233 F20101115_AABAMT camacho_a_Page_073.QC.jpg
c2835b006d893e2676c9be1afc45dcd7
1addf06de9cd862d752c7207b46b66ea4b32a7c5
F20101115_AAAZOE camacho_a_Page_021.tif
694480366ebc6f0b7bb36a8d09a29176
c585289d5ea163cd756a8184f0e6f9591a0074e1
1063 F20101115_AAAZNR camacho_a_Page_035.txt
bf82fa0e8a6569fa614f914e6ab15c48
9e9e57dab63ab0e064c23aff10e02effe5e25f41
6204 F20101115_AABANI camacho_a_Page_043thm.jpg
d84f810cbe9acb2315d805a7fe03c9d6
21164cf23f54ab0d72ae911e46562e8c35376767
23795 F20101115_AABAMU camacho_a_Page_041.QC.jpg
0327a38aeef0a4c47a9cee41ad702630
5d769ea5f80b1cfe8dba28a3cff4657342f54733
9219 F20101115_AAAZOF camacho_a_Page_004.QC.jpg
fdc4d0f8eff28f37416e40f536cc993e
559e84fe65c701221fdcf2df593268bedc6ec41e
29906 F20101115_AAAZNS camacho_a_Page_101.jp2
b5f30a6948fc2fb6fc5a5fb68106f88a
528a407957b014ecf66453e7529e173307bcaf6b
17173 F20101115_AABANJ camacho_a_Page_024.QC.jpg
0e0caf1d902efef2f3469bfebde5589e
75151e1257a3160dec122a46de42e393da04c0be
6594 F20101115_AAAZOG camacho_a_Page_041thm.jpg
aa72364732df20fd6d4db5fa93757b10
2ef9ef405157ddaad5e41b5f2b584911d8fb3153
3290 F20101115_AAAZNT camacho_a_Page_007thm.jpg
dcc92a6b42251c98a53163e2dd2012fc
46c29447d85678850f218c6614eae7910d2fcff9
7093 F20101115_AABANK camacho_a_Page_001.QC.jpg
768b96bb0c62356cf7a2eafa580cdb2a
51f37766becdb694b24430d867f9a58f15160444
11321 F20101115_AABAMV camacho_a_Page_007.QC.jpg
603c994274bb73691df821ba1f412ba4
3a59ba4d66a57be566130fa64c4ee9444156480e
47766 F20101115_AAAZOH camacho_a_Page_014.pro
2d82db3797e7470eacc760d90145ebcd
665f9d2b20dd63c15304adbf725597bd29109f49
53031 F20101115_AAAZNU camacho_a_Page_079.jpg
96fa49a65bea78d5ad6d5db9e744500e
5e74e69130a4a8ffadb201736d24779ecacde37d
4906 F20101115_AABANL camacho_a_Page_023thm.jpg
72ff4bbe8d7e7fccebaad600902d6b0b
75a3743abddf96fe1384188dc42d0e18c74fd3b8
6373 F20101115_AABAMW camacho_a_Page_066thm.jpg
a7293be8735d3a6deecbc8f768a9d199
d53ead2f170a3ff71d664619ef4273ec6608456a
1511 F20101115_AAAZOI camacho_a_Page_098.txt
9e83ae05f0b98eac233a6cdea2ba30d5
984e4bb3e0bc1e011bee925580b9a3af182b8895
1956 F20101115_AAAZNV camacho_a_Page_014.txt
58edaad24dac8778d154337d5a11a8d8
6ca69a168d8f515b4ba620750f7332868c763899
5881 F20101115_AABANM camacho_a_Page_062thm.jpg
c023c2edf022ef22d86b0a4f7763f3f0
301a222be9378cfc1b1558e6ccc47171ff3fe279
F20101115_AABAMX object1-1.sawtooth_waveform.wav
1ec731906a080a61e4f1173c184cb60f
9286080583df5f5466f1d3adfd9fbc6ba47e5407
92102 F20101115_AAAZOJ camacho_a_Page_006.jpg
2a2b76e00d330fde20f904c81409190e
de8bb6164277d4986b65f77d524e80c20e8c0601
5105 F20101115_AABAOA camacho_a_Page_045thm.jpg
a83b23b91a9081f9e68cc4226d6eac20
538eacf5be3cd5f99017f26401fb80f30c1a766f
49557 F20101115_AAAZNW camacho_a_Page_069.jpg
72dea4e00afa9857902e6222e3ebddf6
322ca8d6b9dea03dc40a01cddd474d1ef1327c4e
16025 F20101115_AABANN camacho_a_Page_026.QC.jpg
5da659960f69ef4f57f8ad8da68376a5
dcb5532a22174c499fa507304c2f3bf33be4c78e
6954 F20101115_AABAMY camacho_a_Page_027thm.jpg
832237ebbd4edf9125e20bc5e86cc2a5
2f63598ef9e690c6d0b4e57cbe4e47626e59c7d5
52910 F20101115_AAAZOK camacho_a_Page_034.jpg
82278e28143c3b04067a6e4925e9072a
37e61b06ba217fff08979243abdc1c4c3e8d7868
8498 F20101115_AABAOB camacho_a_Page_115.QC.jpg
9d0053d8753c3b45679f3df3bf76b3d5
2394f170c2693861ccd6d30fe281cdac10bcf4de
F20101115_AAAZNX camacho_a_Page_041.tif
f7207b3fa4cf3d076d1d6a75cf019681
adc3b62165fce2793a972f5bc8021707dbca1ea1
3131 F20101115_AABANO camacho_a_Page_030thm.jpg
57f8c58dddcfe410f6507e5ac98ef3e4
965f2baed13e8a8873063b7a0d0942145ef417c0
15876 F20101115_AABAMZ camacho_a_Page_038.QC.jpg
c2b5961dd4645d558420ce25dcf026f4
b8d83eb1fbc4b5330cd67fecf5eac1c26688363b
108323 F20101115_AAAZOL camacho_a_Page_106.jp2
c5c8849e164d8b8707c1845d52e09c69
4466af8f939638d812d791c35fc057cfe7a5ed64
4542 F20101115_AABAOC camacho_a_Page_031thm.jpg
cd0137aa2c41091bedeb48678aebd21d
a43ee7a83a79de632b053e5d5c93f20c25ac64a7
101482 F20101115_AAAZNY camacho_a_Page_102.jp2
c6e10e6973b916b8dfdebe9eaaff4f93
8cb9fef51c3586d7912d4432b382cc47210aad06
18004 F20101115_AABANP camacho_a_Page_020.QC.jpg
477d60de3db95ebe5a9450b9e03f7548
3e7022414046b24fa605d5523f93c83ca4561f78
28056 F20101115_AAAZPA camacho_a_Page_022.pro
28048c1998c85f4c2c529cfaee741673
e47bb668999cf69c223096e608f4cc826db7c213
50735 F20101115_AAAZOM camacho_a_Page_009.jpg
583080a5f377c5a0ca5b0d3504c87409
52477683d3c974d01c13a6490f8e65a192407d4a
7096 F20101115_AABAOD camacho_a_Page_021thm.jpg
3818dda299f264f77958b8480d06e50f
391c9e9c42572b5782e97e62018fce4a84c517ee
20275 F20101115_AAAZNZ camacho_a_Page_054.QC.jpg
763c6cac410e53689c33c58c1f5d9220
91f7513dca097dec82b051781384e902e810c9b5
23834 F20101115_AABANQ camacho_a_Page_006.QC.jpg
4743790f25a4d7424c06d8e5d531016b
852f8ba652c8ba596a5e0b4031ef7f25adf35975
5132 F20101115_AAAZPB camacho_a_Page_059thm.jpg
1454193a013fa645d06442c6bd05d4b3
aeb15fe286c43c27a0172a849fae263e553bcb8a
4700 F20101115_AAAZON camacho_a_Page_034thm.jpg
addb15b41fe148ca3ecfa7ef73704744
b775645d004a68aa21e24bb8028a9e9f1dd119fc
19538 F20101115_AABAOE camacho_a_Page_040.QC.jpg
b45850390d672986ce0520e8248c0b4d
c5c959349e376776e2a36729fa3f77a863c04f0f
7181 F20101115_AABANR camacho_a_Page_019thm.jpg
1ca0f03020a977dae8dc8d1e5b19a6f2
d206e2146b5c4dae8f76a177091b0cfa169c85f1
6906 F20101115_AAAZPC camacho_a_Page_053thm.jpg
585da70307fbb47accdb9d375820fac0
c9ff6989e4a7a1422a11b4e02351fc2fb2f1ae4f
F20101115_AAAZOO camacho_a_Page_058.tif
28eb357ed1cc81603570df2ceea2249c
3670c1667d098fbbbb56620ed5f7c3d8e6141f59
4744 F20101115_AABAOF camacho_a_Page_033thm.jpg
9c3f30bf428e70c0953b0bb6bca8bb05
3542ad91b77027f84967c8a4caa83401ee9039c1
4630 F20101115_AABANS camacho_a_Page_069thm.jpg
7b3b2357ca8c3c9d32ff9d03b1e6f0e4
4b339405ba233821b9acc073be56cdf29a0679ec
57020 F20101115_AAAZPD camacho_a_Page_116.jpg
c9d91bc1b2c8d3cde80c94995ae5d91a
bad714b7b3e2adca79e1b409a812672dc5221e7d
22460 F20101115_AAAZOP camacho_a_Page_057.QC.jpg
a082febd6221c176421c0ba70baff989
e3589ff5943277c967e41e1dbd6ab15c75db7025
10718 F20101115_AABAOG camacho_a_Page_096.QC.jpg
4878b444a2417d1015f58ef861b974d4
b803aefad2b27981901f9da94f8d8904e00bfb3c
10796 F20101115_AABANT camacho_a_Page_061.QC.jpg
b20465c10621120abc2fda1c40ce3f32
58c84c027031e05d03b3ffe7997bb070b80b806f
6874 F20101115_AAAZPE camacho_a_Page_104thm.jpg
6447e496dbfa1a3cacb45294c9c4bc81
50eb9b4c317008d0257add862894663af4bafea7
1628 F20101115_AAAZOQ camacho_a_Page_091.txt
bc2cb75f00c03857c0d01173fb8e9830
7e1e4425640dab072190ed25cafd34e4e3bb26ed
14982 F20101115_AABAOH camacho_a_Page_069.QC.jpg
312e9ab5f809a1f89ebd69504be4dfb4
a57cbae30a3b678197c20d83b57c50dd1148fd87
4780 F20101115_AABANU camacho_a_Page_048thm.jpg
b09cdf7441df61cc2a280c2458e281bd
47e4bfb2bc5c113ea9e5f97e0ecd083829434512
63932 F20101115_AAAZPF camacho_a_Page_113.pro
8ae0f79bb54d53d983489ea45a5cb6a4
c7f949d733c3ea4fdc3370be970cc6a866a9c0b0
7195 F20101115_AAAZOR camacho_a_Page_087thm.jpg
54177c08ee4fd08c9eab173e62048130
d4a73de0c82483a606bb69e3ca53d004e3a2b010
20003 F20101115_AABAOI camacho_a_Page_075.QC.jpg
789dd07cc3216e3483cfc72655e74983
f8f5298a14c05fb49672318dfb3089fb2c0d1b6a
6663 F20101115_AABANV camacho_a_Page_068thm.jpg
18db03dfa4b6f20481d4847046ace8b4
93cd24aebd9b23ae03fd8cd8786a2a1f29e0e103
1234 F20101115_AAAZPG camacho_a_Page_026.txt
2dfd092dfb43dbc8b22b7f97ae73069f
3cfdba620386f5b6edccda459496a3e9b315ced7
15800 F20101115_AAAZOS camacho_a_Page_048.QC.jpg
738584ac790e16335ba74a8aee204877
e65e2ddf48cb73953c9152d509b97969a3481e97
6834 F20101115_AABAOJ camacho_a_Page_112thm.jpg
d42b04a97ec222291d63a6e0321754ad
053dc4a43f67b326a8ab5b8599c76adcb2ebe6ab
195298 F20101115_AAAZPH camacho_a_Page_110.jp2
03241a2eb0b5343ceeb9aae6f05447e8
8a33c3f41bf6a8317803049a540b98d3c3e89f90
74031 F20101115_AAAZOT camacho_a_Page_016.jpg
55c93409b310c8824a8d0023177ec3dc
1b6a01a832bb589c0fd339184aaeb1a3a06e16f4
4702 F20101115_AABAOK camacho_a_Page_038thm.jpg
1662d2a532e36f0fa00729cb728b03e8
ca32425d9c32f922b16d722b1982d8e43f491292
5918 F20101115_AABANW camacho_a_Page_047thm.jpg
fdddcf4edf55e7f25fd1f30844d8df72
93dfce9b82669010ad279a5999bb2a57b072739d
26097 F20101115_AAAZPI camacho_a_Page_067.QC.jpg
bcad953301ce9bb483d5abbb661a07d9
645037fb43f8642b2d602d1de7e1f86bf452340c
1470 F20101115_AAAZOU camacho_a_Page_037.txt
a882ee850f94c9564d3ebc551b0e2a40
90a919368c3dede40b4e89290e07f94a84b73b4d
F20101115_AABAOL object2-2.strong_2nd_harmonic.wav
c99f67981df3d2b117bd7e5b8babcc3e
951cd73f6fa465a9f7788f0ce12dd0aa7e94ca3f
3367 F20101115_AABANX camacho_a_Page_061thm.jpg
6921066d95fe5b9545c5847702e31db9
c66d898f39187e7c73f24783f8a3930f6c1372d9
2133 F20101115_AAAZPJ camacho_a_Page_074.txt
cb1a177edb30fd84e76020a5f804a79e
df7f2490a47dbef2ec9113405489979cc84c814c
6608 F20101115_AABAPA camacho_a_Page_089thm.jpg
91f71615a98105c40603df19ea33a871
d708000d7bfba4b4ca108c726ab7b966f5419b43
40501 F20101115_AAAZOV camacho_a_Page_010.jpg
317ee7bc92c34d954ae157cc1b34be2b
6b4d2e4754302e6b7b9a0b447279d341e09bae81
4987 F20101115_AABAOM camacho_a_Page_052thm.jpg
b7fa128d2fd56206854a5cf4c32a17f1
c62109b3d7ef0d916a6faf81e72865b3eb9b3901
18406 F20101115_AABANY camacho_a_Page_116.QC.jpg
f2b419e0e38490d7cb3f32676949d83a
37d7ad02112124beebd428839e16be2e2e6006d1
17128 F20101115_AAAZPK camacho_a_Page_052.QC.jpg
5afe3218969c2fb4ccbbc7cf181eaa7d
bd4c3b72f9e07dad98aad75634f3e67cb39af169
4963 F20101115_AABAPB camacho_a_Page_073thm.jpg
73f9c42df2dd55672f9eb6ec8733b5cf
3bfeb1c29bbac40de404f020355de752bfad6591
58461 F20101115_AAAZOW camacho_a_Page_083.jp2
f06eb066339e31911d9bc7f51f6fe067
71f3613906fc1d44d50f043dd66bbe317eff47a4
5648 F20101115_AABAON camacho_a_Page_037thm.jpg
813082a7aae23d5ecdd646fe7334b933
b5f0afae3e91de39a82cf80c0928d58fa90b0ca4
25721 F20101115_AABANZ camacho_a_Page_019.QC.jpg
4594d34dd6bc2b078c06ab7b6fd8a549
90200127a2290475e896b26f8637e57ac4862b59
25325 F20101115_AAAZPL camacho_a_Page_053.QC.jpg
ad90032f9e1175130312fd097474c0fc
b1c6174f7074dda98a84b7fd373f724875d6ee65
4559 F20101115_AABAPC camacho_a_Page_058thm.jpg
7a755c4165d67c44872d89e790bf55dd
928cbad7792a2812c5ab115221b2096817966471
F20101115_AAAZOX camacho_a_Page_075.tif
40199106f1aa2a12a0f17065c553497a
2bb954beb6573175d2b495a2000cb8d845fc9923
18034 F20101115_AABAOO camacho_a_Page_076.QC.jpg
74c69b0c3fec64939939227d6abac46d
c78d0147f24278b6a4f03eed138d384b428becb6
813997 F20101115_AAAZQA camacho_a_Page_085.jp2
d66676ddca21e48f62bb305d938dab9f
94ac22bcb3464a767e778d8aa750fb6cadc3c17f
1930 F20101115_AAAZPM camacho_a_Page_104.txt
682ae90b71e0049e55e879258e43cf25
bb698f8b3bdc96655837d217ea9330c83632b607
19369 F20101115_AABAPD camacho_a_Page_025.QC.jpg
c9288db7af3625e07a4071c7408bd044
7ad41f75993311bedfc78cdc08ad2e1f7c1ea629
51965 F20101115_AAAZOY camacho_a_Page_023.jpg
78df365430208b8f0380576cebf0a654
04378438640bf8060ae5e159ee37fc87911dadce
22874 F20101115_AABAOP camacho_a_Page_081.QC.jpg
2390df52d7a603506a1496ca5b6da3e4
2fb494e22c875b42e7c8fbac915d9a92fde2df70
15238 F20101115_AAAZQB camacho_a_Page_083.QC.jpg
a4e3de8be2b4a7fb693e8fb8680aa300
a0ff413cbe7fbdb619d9189b5720900f46c2529c
F20101115_AAAZPN camacho_a_Page_044.tif
e5069491c1212178e7542e4e431265a2
547343c523538c1ac31a03e17942e62d77d4884e
21271 F20101115_AABAPE camacho_a_Page_005.QC.jpg
9bfc902113fe2376861a5dbb2892ea61
f394843d87f972808bfae85d7e6e8da9397ab332
22546 F20101115_AAAZOZ camacho_a_Page_001.jpg
e34c4241a6894c85e6294d9a87fabdc1
6ee296bd0dc9f080e4c8489b9fe7741460d273fb
5854 F20101115_AABAOQ camacho_a_Page_099thm.jpg
27de292b11865e99fd0dffad7e242397
0b0ef474b619f56885b4a3d708eaab578f34abe9
2182 F20101115_AAAZQC camacho_a_Page_107.txt
76a12dec4636a3f047836c305b433987
8ef99e1e26560a537beea178bbc8a20b48d4e4c8
62359 F20101115_AAAZPO camacho_a_Page_037.jpg
90b08625790eebbecb105cfe6674d0ae
c711c05be422bb0230569c7c16c7666abb114bc7
26004 F20101115_AABAPF camacho_a_Page_105.QC.jpg
7d1d1120f80bb70516daab64d90bf45b
b164c8db86a10d789d754e287d99958d25a1f990
6550 F20101115_AABAOR camacho_a_Page_081thm.jpg
d76e9f630dbc710fbcd0a6bb7ad95304
735df56f9ae836bea5f7af0815c27b86353a4d65
1533 F20101115_AAAZQD camacho_a_Page_080.txt
3fd0d7ba871aaab96e566c76b78f74da
05d89914aefe973681f4bbb257b8201c570d8b02
30143 F20101115_AAAZPP camacho_a_Page_064.pro
08f197671c258a8099503a2e3bfbda99
3c4094c294fe93ae5827fab2a51531f925b30e44
5414 F20101115_AABAPG camacho_a_Page_116thm.jpg
d61b3868c13863bb617320a2e939dc16
ab93931ffee19e3ad55f31b29822666ae083b248
7128 F20101115_AABAOS camacho_a_Page_055thm.jpg
2ec82d715bb11e7ce68a46e27baa4111
a45f654fcd073e33a63ced550f204c78db67ebb5
F20101115_AAAZQE camacho_a_Page_098.tif
4389c8ecd209815a67656245d2ae71b1
22caaa967bdef289b802daac42b3f11526cb2a41
59466 F20101115_AAAZPQ camacho_a_Page_025.jpg
f4942a9f8b46dcb82633716948e540d1
5e82f2b54e551b80f5954ddbfee192dcc24957de
26266 F20101115_AABAPH camacho_a_Page_114.QC.jpg
71af2eae0c6ea99b481638cbb0a2e267
d2be0c2e1d82e8d6ef1ce2429db97828112f9d8e
17942 F20101115_AABAOT camacho_a_Page_045.QC.jpg
27a202ba47c76a200a81d4fbfdd81bd7
df4666ab35dc68a7dcfe3969573efb5d7bb311ad
30558 F20101115_AAAZQF camacho_a_Page_040.pro
4bb7c43c11ee875b5d3e80d6364e29f8
1ece4d33648ffbe940e3feab6a614e963bc5de77
672468 F20101115_AAAZPR camacho_a_Page_023.jp2
0030359a01fcbdc37ecac8b423121ec7
223d562ad065031cf57cbe3d00781a46ccbedb21