An analysis of prosodic systems in the classroom discourse of native speaker and nonnative speaker teaching assistants


Material Information

Physical Description:
x, 315 leaves : ill. ; 29 cm.
Pickering, Lucy, 1966-
Subjects / Keywords:
Linguistics thesis, Ph. D   ( lcsh )
Dissertations, Academic -- Linguistics -- UF   ( lcsh )
bibliography   ( marcgt )
non-fiction   ( marcgt )


Thesis (Ph. D.)--University of Florida, 1999.
Includes bibliographical references (leaves 305-314).
Statement of Responsibility:
by Lucy Pickering.

Record Information

Source Institution:
University of Florida
Rights Management:
All applicable rights reserved by the source institution and holding location.
Resource Identifier:
aleph - 030479932
oclc - 43303984
ABSTRACT



Overview
L1 and L2 Discourse


Models of Discourse Prosody
A Model of Intonation in Discourse
Comparison with Two Models of Intonation
Additions to Brazil's Model
Conclusion

3 METHODOLOGY

Database
Data Collection and Analysis
Transcription Conventions


Introduction
Sequence Chain Structure
Pitch Sequences and Discourse Markers
Tone Choice and Orientation
Conclusion


Introduction
Sequence Chain Structure
Pitch Sequences and Discourse Markers
Tone Choice and Orientation
Conclusion


Introduction
Sequence Chain Structure
Pitch Sequences and Discourse Markers
Tone Choice and Orientation
Conclusion



7 CONCLUSION

Summary of Analyses
The Role of Prosodic Structure in Discourse
Suggestions for Future Research

APPENDICES





REFERENCE LIST



Abstract of Dissertation Presented to the Graduate School
of the University of Florida in Partial Fulfillment of the
Requirements for the Degree of Doctor of Philosophy



Lucy Pickering

August 1999

Chairperson: Diana Boxer
Major Department: Program in Linguistics

This dissertation investigates the role of prosodic structure in the

classroom discourse of native and nonnative speaker teaching assistants in

one American university. Video and audiotaped data of naturally occurring

teaching presentations given by male North American, Chinese, and Indian

English speakers were collected in the classroom. Fundamental frequency

contours and pause structure were calculated using a Kay Elemetrics

Computerized Speech Laboratory. Patterns of intonation, stress, and

pausing were then interpreted using a model of intonation in discourse.

The results of the native speaker analysis show that intonation and

pause structure are organized systematically by these speakers both to

structure information (for example, to mark topic boundaries and establish

contrasts), and interactively to establish a rapport between discourse

participants. The results of the two nonnative speaker analyses show that

both groups could be characterized by a typical prosodic profile which

marked speakers as deviating from a native speaker standard. Typical

pitch and pause patterns found in these data show little indication that

teachers are directing their presentation towards assisting the students in

their comprehension of the material. Conflicts between prosodic cues and

organization at other levels of the discourse (for example, topic

organization or syntactic structure) make the informational structure of the

discourse more difficult to interpret for the native speaker hearer. In

addition, intonation choices are shown to contribute to a distancing

between teachers and students. At an interpersonal level, they frequently

characterize teachers as uninvolved and unsympathetic from the

perspective of native speaker participants in the discourse.

The study concludes that prosodic structure forms a natural link

between grammatical and sociolinguistic competence and bears a high

communicative load in terms of both structuring information and expressing

relationships between participants. Therefore, prosodic miscues in

nonnative discourse will negatively affect undergraduate perceptions of the

nonnative teachers' competence and personality and are one underlying

cause of cross-cultural communication failure between international teaching

assistants and their students.



Over the last decade, both university faculties and graduate

programs have become increasingly diverse. The numbers of

international teaching assistants and lecturers in scientific and technical

fields such as engineering, mathematics, and laboratory sciences have

increased dramatically (Mooney, 1990). The majority of U.S.

undergraduates are now more likely to have important contact with

international staff in their introductory courses, and nonnative speakers

are required to be "professional communicators" on a daily basis in

their classrooms (Scollon & Scollon, 1995). As with many other

workplaces, cross-cultural communication has become an integral part of

academic life in universities across the country. However, communication

failure between nonnative teachers and their students is not uncommon,

and concern regarding the competence of international staff remains

acute (Cresswell, 1990). Increasingly, screening programs developed to

assess the linguistic ability of international teachers have recognized

that successful communication between language groups requires a

sophisticated communicative competence on the part of the nonnative

speaker. This includes the ability to use language appropriate to a

given situational context, and to recognize the expectations of native

speaker discourse participants. One area of linguistic competence which

is frequently overlooked in this discussion is the prosodic structure of

nonnative discourse.

This dissertation investigates the contribution of prosodic

structure to possible cross-cultural communication failure by analyzing

the systematic use of two prosodic variables, pitch variation and pause

structure, in the naturally occurring discourse of native and nonnative

teaching assistants. Using native speaker presentations as baseline

data, the analysis focuses on the role of discourse prosodics in typical

classroom presentations and in rapport-building between teachers and

students.

In comparison with analyses of the syntactic and lexical features

of text, the contribution of the prosodic characteristics of longer

stretches of speech has remained largely understudied (Levinson, 1983).

Systematic investigation of the role of intonation, in particular, has also

been hampered by its traditional representation as a "half-tamed

savage" (from Bolinger, 1978, cited in Vaissiere, 1995), lying on the edge

of language and more appropriate for paralinguistic investigation. More

recently, however, improvements in the instrumental techniques available

to researchers in speech perception and new approaches to discourse

analysis have resulted in a revised conception of the role of prosody in

the production and interpretation of spoken discourse. Prosodic

features such as stress, intonation, rhythm, and pause structure have

been shown to form a natural link between linguistic and sociolinguistic

aspects of language, as they bear a high communicative load in terms of

both structuring information and expressing relationships between

discourse participants (Brazil, 1997; Gumperz, 1982). In light of this

research, prosodic features become measurable as a critical component of

the communicative ability of nonnative speakers, as they directly impact

linguistic, sociolinguistic and discourse competence.

The role of discourse prosodics in information structuring has

been investigated in a number of experimental studies which propose

that prosodic features such as pitch (as measured by fundamental

frequency) and pause structure are used in the production and

processing of local (utterance level) and global (discourse level)

information structure (Grosz & Sidner, 1986). Production studies in

English and Dutch show that a speaker's use of pitch and pausing can

be directly linked to the topic structure of the discourse (Grosz &

Sidner, 1986; Nakajima & Allen, 1993; Swerts & Geluykens, 1993, 1994;

Cutler, Dahan & Donselaar, 1997). Speakers tend to use a high pitch

level, or fundamental frequency (F0), at the initiation of a new topic, a

mid level at points of continuation, and a low F0 accompanied by longer

pauses at topic final boundaries. Nakajima & Allen (1993) also found

that topic elaborations or 'asides' were produced with lower F0 onsets

and finals and were characterized by a restricted pitch range. Swerts

& Geluykens (1994) conclude that "this points to a very sophisticated

use of global F0 features by the speaker, and shows that we should

look beyond the local level when studying the discourse function of F0

variation" (p. 31).
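
The tendency reported in these production studies can be illustrated with a small sketch. It is not the procedure used in any of the cited studies: the Unit fields, the function name, and all threshold values are hypothetical, chosen only to show how a high pitch level, a mid pitch level, and a low pitch level followed by a long pause might be mapped to topic-initial, continuing, and topic-final positions.

```python
from dataclasses import dataclass


@dataclass
class Unit:
    f0_hz: float              # representative pitch level of the unit, in Hz
    following_pause_s: float  # silence after the unit, in seconds


def topic_position(unit, speaker_low_hz, speaker_high_hz,
                   high_cut=0.66, low_cut=0.33, long_pause_s=0.8):
    """Label a unit from its pitch level relative to the speaker's range.

    All cut-off values are illustrative assumptions, not published figures.
    """
    span = speaker_high_hz - speaker_low_hz
    if span <= 0:
        raise ValueError("speaker_high_hz must exceed speaker_low_hz")
    # Position of the unit within the speaker's range (0 = bottom, 1 = top).
    relative = (unit.f0_hz - speaker_low_hz) / span
    if relative >= high_cut:
        return "topic-initial"
    # A low pitch level counts as topic-final only with a long pause after it.
    if relative <= low_cut and unit.following_pause_s >= long_pause_s:
        return "topic-final"
    return "continuation"
```

Normalizing against each speaker's own range, rather than using absolute values in Hz, is a design choice of the sketch so that speakers with different voices can be compared on the same scale.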

Listener perceptions of the role of prosodic cues in information

processing are typically tested using response times to manipulated or

synthesized speech (Kreiman, 1982; Grosz & Sidner, 1986; Grosjean, 1983;

Swerts & Geluykens, 1994; Cutler, Dahan & Donselaar, 1997). In these

studies, listeners were able to identify major discourse boundaries and

predict when an utterance was likely to end using only prosodic

features such as pause length and F0 variation. When syntactic and

prosodic cues were manipulated so that utterances were syntactically

complete but prosodically incomplete, listener response times increased,

suggesting that this mismatch of linguistic signals required listeners to

reanalyze the information (Berkovits, 1984; Sanderman & Collier, 1997).

Swerts & Geluykens (1994) conclude that "listeners are able to deduce

discourse structure from prosody. Both pause duration and pitch

variation appear to be important perceptual cues" (p. 38). Collectively,

this research suggests that speakers employ prosodic structure to

organize information at a global level and that listeners use prosodic

cues to parse incoming information and predict upcoming discourse


In addition to these informational functions, discourse analysts

have proposed that pitch variation and pause structure form part of a

systematic use of prosodic features for indexical, or non-referential,

functions (Gumperz, 1982; Couper-Kuhlen & Selting, 1992). Indexical

functions include the use of pitch variation to regulate turn-taking in

conversation, to communicate sociolinguistic information such as status

differences, solidarity, or social distance between interlocutors, or to

project speaker assumptions regarding what information is 'new' or

shared in the context of a specific interaction. In general terms,

prosody contributes to relationship-building between participants. Both

the referential and non-referential functions of prosodic structure are

united in Gumperz's (1982) theory of conversational inference.

Gumperz suggests that comprehensible spoken discourse is

achieved through the production and interpretation of multiple cues or

signals present at all levels of the discourse, i.e., lexical, syntactic,

prosodic and non-verbal. The pragmatic or communicative value of the

discourse message is contained within the composite whole. For their

production and interpretation of these devices, or contextualization cues,

participants use "contextual presuppositions" (institutionalized linguistic

and cultural knowledge), and "situated inferencing" (moment by moment

inferences regarding the speaker's intent based on the context of the

interaction). Gumperz proposes that over time, these cues have become

tacit, conventionalized choices, and in normal interaction between

members of the same speech community, discourse participants will

implicitly assume a shared framework of production and interpretation.

The reliance on a shared linguistic and sociocultural background

for interpretation of the discourse message has particular implications

for cross-cultural communication. The way in which participants orient

themselves to the interaction and to each other depends on their on-

going interpretation of conversational behaviors. Those behaviors that

differ across speech communities may not be immediately evident to

interlocutors, as interpretation rests on deeply rooted, culturally based

presuppositions which are not easily retrieved by a native speaker on a

conscious, analytical level. Participants are likely to assume a mutual

understanding of discourse conventions, and infer speaker intent within

their own interpretive framework (Green, 1989; Humphreys-Jones, 1986;

Tannen, 1985). Prosodic cues are particularly vulnerable to

misinterpretation. In Gumperz's (1982, 1983, 1992) own work

investigating interactions between Indian English speakers and

British/American English speakers, he shows that Indian English

prosodic conventions frequently lead American/British participants to

view Indian speakers as discourteous, aggressive and misleading.1 In

light of the double function of prosodic cues in both structuring

information and rapport-building between participants, Gumperz et

al. (1984) characterize intonation as "among the most important of the

devices that accompany cohesion in spoken interaction" (p. 5).

This dissertation extends the current research in both speech

analysis and cross-cultural communication concerning the role of

prosody in discourse. The study compares two prosodic features, pitch

variation and pause structure, in the teaching presentations of native

and nonnative speaker (Chinese and Indian) teaching assistants in an

American university. A qualitative design was chosen in order to

conduct a microanalysis of the complete pitch and pause structure of

each of the discourse extracts recorded for this study. Fundamental

frequency contours and pause lengths were computed for each extract

using a Kay Elemetrics Computerized Speech Laboratory. These data

were then analyzed using a model of intonation structure in discourse

proposed by Brazil (1997). Brazil's framework comprises a series of

formal intonational categories which operate at the same level of

abstraction as syntactic and lexical choices, and have independent

implications for the discourse structure. Both Gumperz's and Brazil's

proposals share the same underlying principles regarding the

communicative function of intonation. Central to Brazil's model is the

principle of a state of convergence between discourse participants; that

is, the continuous negotiation toward a roughly mutual state of

understanding in the immediate and constantly changing world of

naturally occurring spoken discourse. Intonational choices made by the

speaker project both referential and non-referential information which

the hearers will interpret within their understanding of how the

system operates in English.

1 Although the models of discourse and intonation structure used in
this study (Gumperz, 1982; Brazil, 1997) are based primarily on
observations from standard British speakers, both researchers have also
used American English examples. As the formal constructs proposed in the
models and their interpretative value were found to be equally applicable
to the standard American English speakers investigated in this study, I
will subsume both the standard models of American and British English
under the title 'English' throughout this study for ease of exposition. This
term is contrasted with 'indigenized varieties of English' which is used to
describe Indian English. I also note, however, that there may be
differences in the interpretation of certain intonational features based on
localized regional or social factors, in both American and British English
(see, for example, Local, 1985; Bolinger, 1989), or in other native standard

The comparison of the native speaker (NS) and nonnative speaker

(NNS) prosodic data is set within this larger framework of discourse

interpretation. If it can be established that native speakers are using

prosodic cues to orient their hearers to the interaction, then analysis of

the nonnative data can determine whether prosody is used by these

speakers to transmit the same information. In addition, the formal

categories proposed by Brazil constrain the hearers' interpretation of

particular pitch movements. Therefore, we can surmise what effect

specific prosodic miscues in the nonnative speaker discourse are likely

to have on the comprehensibility of the discourse and rapport-building

between teacher and students.

The study focuses on four principal research questions:

1. Based on a model of prosodic structure in American/British
discourse, is there evidence that native speaker teaching
assistants systematically pattern intonation and pause
structure for informational and social functions for the benefit of
their hearers?
2. Based on an analysis of parallel native speaker and nonnative
speaker teaching presentations, what similarities and differences
in prosodic patterning are found in the teaching discourse of
Mandarin Chinese ITAs?
3. Based on an analysis of parallel American English and Indian
English teaching presentations, what similarities and differences
in prosodic patterning are found in the teaching discourse of
Indian English ITAs?
4. Based on these analyses, is the prosodic structure of ITA
discourse likely to be a cause of miscommunication at
informational and social levels between ITAs and their American
English hearers?

The discussion also addresses issues which evolved naturally out

of the analysis, such as differences between Indian and Chinese

speakers' use of English prosodic systems, development of prosodic

features in a second language, and how these results can be applied to

cross-cultural communication and ESL pedagogy.

The remainder of this chapter examines recent literature

concerning the prosodic structure of L2 discourse, and ITA discourse

specifically. Chapter 2 describes Brazil's model of intonation in

discourse in detail. Currently, there are several models of prosodic

structure in English discourse available to the researcher

(Pierrehumbert & Hirschberg, 1990; Watt, 1994; Brown, Currie &

Kenworthy, 1980; Halliday, 1967). Chapter 2 also includes a comparison

with two similar models, and a discussion of why Brazil's model was

considered to be most appropriate for this study. Finally, Chapter 2

describes two additions that have been made to Brazil's model for the

purposes of this study. The first is a unit of intonation structure

developed by Barr (1990), which formalizes a prosodic paragraphing

structure found in the lecture discourse of native speakers. The

second is the inclusion of pause analysis based on previous findings

regarding the prosodic features of typical NNS teaching discourse

(Rounds, 1987). Chapter 3 describes the data and the procedures used

in this dissertation. The chapter includes a discussion of the

instrumental techniques used in the data analysis, and examples of the

fundamental frequency read-outs used to illustrate pitch variation in the


Chapters 4, 5, and 6 comprise the results of the study. Chapter 4

reports the native speaker data analysis. The results verify that NS TAs

make systematic use of prosodic cues to communicate the global

structure of the discourse and to project their assumptions regarding

the knowledge state of a particular group of hearers. In addition, these

speakers use certain intonation choices to create solidarity with their

hearers by acknowledging their participation in the discourse. Chapters

5 and 6 report the results from the nonnative data. The analysis of the

Chinese ITA data, given in Chapter 5, shows that these speakers fail to

make systematic use of prosodic cues for referential discourse functions.

Furthermore, there was little evidence in these data of the use of

prosodic cues to build rapport between teacher and students. Indeed,

intonation patterns were typically found to exclude the hearers from the

context of the interaction. The results of the Indian ITA analysis are

reported in Chapter 6. There was more within-group variation in these

data, possibly related to the speakers' different L1 backgrounds.

However, the analysis suggests that as a group, these speakers use

certain conventionalized prosodic patterns that have been transferred

from General Indian English. For the American English listener, these

patterns frequently obscure the informational structure of the discourse

at both a local and global level and reduce comprehensibility. There is

also less evidence of the use of rapport-building strategies by this

group of TAs in comparison to the NS group.

Lastly, Chapter 7 presents a summary of the three analyses, and

discusses the role of prosodic structure in the comprehensibility of L2

discourse in light of the results of this study. I assert that prosodic

cues are a critical component of comprehensible spoken discourse in

English, and should be viewed as of central importance to the

development of effective discourse competence in L2 learners. The

chapter concludes with suggestions for future research, and the

possible applications of this kind of analysis.

L1 and L2 Discourse Structure

Comparative studies of L1 and L2 discourse structure demonstrate

crucial differences in the production of prosodic cues by L2 speakers

which can negatively affect the interpretation of discourse structure by

NS hearers. Current research suggests that nonnative-like prosodic

structuring in NNS discourse contributes to a lack of cohesion at a

global level, confusion regarding the relationships between individual

propositions at a local level, and misinterpretation of speaker intent at

an interpersonal level (Wennerstrom, 1997; Hewings, 1995; Anderson-

Hsieh, Johnson & Koehler, 1992). In investigations of advanced and

intermediate Asian and European learners, Wennerstrom (1994, 1997)

found that speakers did not use pitch variation to signal new or

contrastive lexical items, and used less reduction of pitch on non-

prominent words. This led to multiple prominences in an intonation unit

and difficulty in distinguishing sentence accent. Japanese, Thai, and

Chinese speakers also tended to use low boundary tones between related

propositions where rising or mid level tones would be anticipated by NS

hearers. Pirt (1990) reported similar results in a study of Italian

learners. In addition to multiple prominences, she found more use of

level and falling unit final tones in the NNS data, indicating that

learners were 'language-oriented' rather than oriented toward their

hearers. Lower proficiency learners also used inappropriate low

boundary tones such as the following (capital letters indicate prominent

syllables, and // indicates the boundary of an intonation unit):

// you must the FIRST RIGHT// (p. 151)
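
The transcription convention just described (capital letters for prominent syllables, // for intonation unit boundaries) is regular enough to be read mechanically. The sketch below is an illustration only and forms no part of the original analysis: the function name is hypothetical, and it simplifies the convention by treating a word as prominent only when it is written entirely in capitals, so partially capitalized words are not counted.

```python
def parse_units(transcript):
    """Split a transcript on // and list the fully capitalized words per unit."""
    units = []
    for chunk in transcript.split("//"):
        words = chunk.split()
        if not words:
            continue  # skip the empty pieces before the first and after the last //
        # Simplifying assumption: prominence is marked on whole words.
        prominent = [w for w in words if w.isalpha() and w.isupper()]
        units.append({"words": words, "prominent": prominent})
    return units
```

Applied to the example above (without the page reference), the function returns a single intonation unit with two prominent words, FIRST and RIGHT, which is the kind of multiple prominence within one unit that these studies report.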

Hewings (1995) found a similar preference for the use of falling

tones in the discourse of advanced L2 learners from Korea, Greece and

Indonesia. This was particularly problematic in situations where rising

tones were chosen by native speakers for "socially integrative"

purposes. Hewings reports that when contradicting a previous speaker,

NSs consistently used rising tones to avoid the appearance of overt

disagreement implicit in a falling tone. In agreement with Gumperz,

Hewings suggests that the use of falling tones by NNSs in this context

can give the impression of deliberate rudeness or animosity on the part

of the speaker.

Studies investigating fluency in L2 discourse (i.e. pause structure

and hesitation phenomena) suggest that this can also confound listener

interpretation of the discourse structure. Typical characteristics of NNS

speech such as repetition or correction of lexical items, and

retrospective drafting of entire phrases (Hewings, 1990)2, disturb the

prosodic composition of the discourse and make it more difficult for the

hearer to retrieve the overall informational structure. Similar

difficulties have been shown for pause structure in L2 discourse.

Riggenbach (1991) and Anderson-Hsieh & Venkatagiri (1995) found that

there were more nonlexical fillers and unfilled pauses in non-fluent NNS

speech, and that long pauses frequently appeared within intonation

units.

These characteristics affect NS perceptions of both internal

cohesion and overall coherence of the discourse structure, and listener

perception studies suggest that NS hearers react in a number of ways.

Difficulties in processing information structure may necessitate hearers

"replaying" parts of the message (Munro & Derwing, 1995). This, in

turn, can lead to "listener irritation" (Eisenstein, 1983), a dual response

to NNS discourse consisting of a negative cognitive reaction to reduced

comprehensibility, and a negative emotional reaction due to annoyance

and distraction. Problematic pause and pitch characteristics in

discourse production have been directly linked to listener irritation in a

number of experimental studies (Brown, Strong & Rencher, 1973, 1974;

Philipson, 1978; Fayer & Krasinski, 1987; Holden & Hogan, 1993). In

summary, the prosodic features of NNS discourse clearly contribute to

what Bouchard-Ryan (1983) calls a "generalized negative affect", which

describes the negative judgements made by NS hearers concerning the

speaker's competence and personality.

2 The following is a typical example of retrospective redrafting taken
from Hewings (1990):
//ER// HE-er// he BREATHED-er// ER// MUCH-er// s-s-// SO MUCH-er//
ER// AIR// he BREATHED in// SO much AIR// (p. 143).

Turning now to the ITA literature, although prosodic structure is

directly addressed in very few studies, where observations are made,

they reflect the findings in the L2 literature. Rounds (1987), Byrd &

Constantinides (1990) and Bailey (1984) show that pause structure and

rate of speech can negatively affect intelligibility of the discourse and

student perceptions of the ITA. In Hinofotis & Bailey (1980),

undergraduate students were asked to comment on ITA presentations.

The most frequent complaint was that ITAs were boring, and it was

difficult for students to concentrate. The authors link this remark to

the monotonic intonation patterns that characterize the presentations.

The underlying problem reflected in these kinds of comments is listener

perception of a "flat, undifferentiated, amorphous structure" (Tyler,

Jefferies & Davies, 1988), created, in part, by frequent silences and a

lack of prosodic cues to signal information structure.

Many of these observations are consolidated in a group of studies

conducted by Tyler and her associates (Tyler, Jefferies & Davies, 1988;

Davies, Tyler & Koran, 1989; Tyler & Davies, 1990; Tyler, 1992; Tyler &

Bro, 1992; Tyler & Bro, 1993; Tyler, 1995). Working within Gumperz's

model of cross-cultural communication, these researchers use

microanalysis of ITA presentations and teacher-student interactions to

illustrate how an accumulation of miscues at all levels of the discourse

structure can result in a misinterpretation of speaker intent by

undergraduate students. Tyler, Jefferies & Davies (1988) show that

prosodic miscues such as inappropriate falling contours, multiple

prominences and disfluency, combine with problematic syntactic

structures and use of discourse marking to obscure informational

structure. In Tyler & Davies (1990), both ITA production and

interpretation of prosodic cues contribute to communication failure

between a Korean ITA and a US undergraduate student. As the

interaction progresses, it is clear from the student's agitated tone and

higher pitch that he is becoming increasingly distressed. However,

during a playback session of the interaction, the ITA told researchers

that "he was not confident about reading the information conveyed by

prosodics and tone" (p. 404), and therefore, did not adjust his approach

to the student. These studies further highlight the critical importance

of situational context. Classroom interaction is an example of "binding

discourse" (Goffman, 1981), i.e., talk that "supports a class of hearers

who are more committed by what is being said" (1981: 140).

Undergraduate students are primarily concerned with their ultimate

success in the class, and may be less tolerant of communication

difficulties in this environment than they would be in some other

situational context. With this added consideration, the ability of ITAs to

both successfully produce and interpret prosodic cues in discourse

becomes a necessary component of their overall communicative

competence.

The design of this dissertation study is consistent with the

qualitative, interpretive investigations of ITA discourse conducted by

Tyler et al., and augments these earlier studies by demonstrating how

global prosodic organization of the discourse can contribute to the

typical cross-cultural communication problems found in many teacher-

student interactions. This dissertation shows that if we do not address

prosodic structure in ITA discourse, we are essentially disregarding an

entire level of discourse organization and access to a tool used

consistently by native speakers to build a positive rapport with other

participants in the discourse. Through comparison with baseline native

speaker data, this study demonstrates how prosodic miscues in the

nonnative speaker discourse can be integrated into an overall

assessment of L2 competence and the ability of L2 speakers to

communicate effectively with native speaker interlocutors.


Models of Discourse Prosody


The previous chapter argued for a discourse framework in which

comprehensible spoken discourse is achieved through the interpretation

of multiple cues present at all levels of discourse production. This

interpretation is based on both the shared linguistic and sociocultural

backgrounds of participants and the situated context of any given

interaction. It was further proposed that prosodic cues contribute

independently to the message contained within the discourse as a whole,

serving to both structure information and establish the relationship

between discourse participants. This chapter introduces a model of

intonation in discourse (Brazil, 1985 and 1997) compatible with this

framework of discourse production and interpretation.

The first part of the chapter gives a full description of Brazil's

model based largely on the work of Brazil (1997) and Brazil, Coulthard &

Johns (1980). Where issues addressed by the model parallel discussion

in the collective literature concerning prosodics, this will be indicated in

order to clarify Brazil's theoretical position within the larger framework

of other important work in the field. The second section compares

aspects of two other models of intonation (Halliday, 1967; Pierrehumbert

& Hirschberg, 1990) to Brazil's proposals. Halliday's model precedes

many treatments of intonation analysis that employ tone unit division

and tonal analysis of tonic syllables, including Brazil's model. However,

the discussion will assert that Halliday's reliance on syntactic and

information structure and attitudinal meaning to explain phonological

form unnecessarily complicates the tonal inventory. The Pierrehumbert

and Hirschberg model is analogous to Brazil's proposals in that it

develops a system based only on phonological form and assigns an

independent pragmatic function to intonation structure. However, I

suggest that the interpretive model they have developed up to this

point offers less insight regarding intonational effects across tone unit

boundaries and, therefore, is unable to investigate the larger patterns

of intonation structure in discourse suggested by Brazil and other

researchers. On this basis, it is argued that Brazil's model provides the

most comprehensive framework in which to investigate an independent

intonational structure of discourse as opposed to sentential or clausal

based units.

The final section in this chapter incorporates two additions to the

model. The first is a unit of intonation structure operating in discourse

proposed by Barr (1990). As Barr is working both within Brazil's model

and with teaching discourse, the investigation of these units has been

added to this analysis. The second is the addition of pause analysis to

Brazil's original work with stress and intonation. Prior research

investigating the teaching discourse of nonnative teaching assistants

(Rounds, 1987) suggests that an analysis of pause structure highlights

important qualitative differences between the overall prosodic structure

of NNS and NS discourse that can affect comprehensibility and

relationship-building in the classroom. Finally, a summary of the revised

model will be given at the end of the chapter.

A Model of Intonation in Discourse

Brazil (1997) proposes that intonation structure directly

contributes to the pragmatic message of the discourse by the use of

intonational cues to link the information to a world or context the

hearer can make sense of. The speaker chooses from a series of formal

options which operate at the same level of abstraction as syntactic and

lexical choices and have independent implications for discourse

structure. The speaker's choices project a context of interaction based

on the on-going situated context of the discourse and her assessment of

the hearer's knowledge state. As this context is constantly changing,

intonation choices are relevant only at the moment of speaking, and the

speaker is involved in a continuous assessment of the relationship

between the message and the hearer. Within the context of any given

interaction, the participants are in the process of negotiating a "common

ground" or background to which "new" or unknown information is

added, contributing to the structure both within and between

intonation units. It is this negotiation toward a state of convergence (p.

133), a roughly mutual understanding of what is being said in the

discourse, that allows for successful communication between participants.

The formal options through which this negotiation is realized are

described below.

In the tradition of functionally based descriptions of English

intonation (Halliday, 1967; Crystal, 1969; Watt, 1994; Tench, 1996), Brazil

adopts pitch defined tone units as a means of breaking up stretches of

spoken discourse. Each unit has a possible three-part structure; however,

only the tonic segment, the actual meaning-bearing element, is

obligatory; therefore, a minimal tone unit consists of only a tonic

segment, while an extended unit contains additional proclitic or enclitic

material. Examples of minimal and extended tone units are shown in

Table 2-1 below.

Table 2-1. Examples of Minimal and Extended Tone Units

SEGMENT    Proclitic     Tonic             Enclitic
           (optional)    (obligatory)      (optional)
           you can       DRAW the GRAPH    now

Tone unit boundary recognition is frequently discussed in the

literature (Crystal, 1969; Brown, Currie & Kenworthy, 1980; Cutler, Dahan

& Donselaar, 1997) and there is general agreement that boundaries can

be detected using a number of phonetic criteria such as vowel

lengthening, changes in pitch direction or short pauses. It is also

recognized, however, that such boundaries are not always easily

identifiable (Tao, 1996; Couper-Kuhlen & Selting, 1996). While Brazil uses

phonetic criteria where they are present, one of the advantages of the

model is that it does not require precise recognition of unit boundaries,

as no linguistically significant contrasts are made on the optional

proclitic and enclitic segments. Tonic segment boundaries are identified

by the feature of prominence, fundamental frequency (F0) excursions

which distinguish prominent syllables from the surrounding content and

represent the speaker's assessment of the relative information load

carried by the elements in the utterance (Halliday, 1967; Crystal, 1969;

Williams, 1986; Tench, 1996). Brazil suggests that at least one, but

usually two, prominent syllables delimit the tonic segment. The way in

which syllables are assigned prominence rests on the pragmatic

intentions of the speaker and what Brazil terms an existential paradigm.

The paradigm consists of what possible choices could appear in each of

the syntagmatic slots of the tone unit based on both the constraints of

the language system and on the non-linguistic situation or the situated

context of the interaction. For example, given a potential tone unit such

as 'a parcel of books lay on the table', at least two possible prominence

selections could be made (capital letters indicate prominent syllables):

a. a parcel of BOOKS lay on the TAble
b. a PARcel of books lay on the TAble

In (a) the speaker presents a prominent choice of 'BOOKS' as opposed to

perhaps flowers or cups, and makes a similar prominence choice

regarding the location, i.e., on the table as opposed to on the floor or

on the chair. The choice of prominence on both syllables projects a

situated context in which both these pieces of information are

unrecoverable either from the prior interaction or from constraints

within the language system. Equally, by choosing not to make

prominent certain other words in the unit, the speaker assumes that no

choice needs to be made from the existential paradigm. This may be

based on non-linguistic or linguistic factors. For example, a choice of

'box' of books (another possibility in the paradigm) can be considered

synonymous to the choice of 'parcel', and books can be assumed to 'lay'

on a table as opposed to 'stand up'. Constraints on the possible choices

in the language system apply to the nonprominent function words such

as 'of' and 'on'. In (b) the speaker chooses to make 'parcel' prominent

and 'books' nonprominent. This projects a context in which other

possibilities from the appropriate paradigm are unlikely as 'books' is

understood as having been already negotiated:

A: Was the book there?
B: There was a PARcel of books there

The way in which these understandings are achieved can range from

constraints in the language system or immediate context, to less

restricted contexts such as assumed cultural knowledge, for example, the

non-prominent '5th' in 'SAKS 5th AVenue' for an American English

speaker and the non-prominent 'hardy' in 'FREEman, hardy and WILlis'

for a British English speaker.1

Support for the function of prominence in projecting the

speaker's understanding of the negotiated status of a given item comes

from both mishearings and prosodic repairs. In the following example,

the mishearing ('eight' instead of 'ace') causes B to project a context in

which 'eight' is already determined and, therefore, realized non-prominent:


A: Which ace did you play?
B: The eight of HEARTS
(Brazil, 1997: 27)

1 Freeman, Hardy and Willis is a national chain of shoe stores in
Britain. This example comes from Brazil (1986).

Prominent syllables are divided into two categories based on

where they appear in the tone unit: the first prominent syllable in the

tonic segment is called the onset and the last is called the tonic

syllable.2 It is the pitch level and pitch movement on these syllables

that forms the basis for the assessment of their communicative value

within the three systems that comprise the model. The systems realized

on these two syllables are key, realized on the onset syllable;

termination, realized on the tonic syllable3; and tone, also realized on

the tonic syllable. Key and termination will be discussed together as

they are closely related, followed by tone.
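
The capital-letter convention used for prominent syllables in this chapter can be checked mechanically. The following is a minimal Python sketch and not part of Brazil's model: the function names are hypothetical, and the code assumes that every run of two or more capital letters in a transcribed tone unit marks one prominent syllable, so a prominent syllable written with a single capital letter would be missed.

```python
import re


def prominent_syllables(tone_unit):
    """Return the capitalized runs (two or more letters) in a tone unit.

    Assumption: such runs correspond to prominent syllables, following
    the transcription convention of the surrounding examples.
    """
    return re.findall(r"[A-Z]{2,}", tone_unit)


def onset_and_tonic(tone_unit):
    """Return (onset, tonic): the first and last prominent syllables.

    With a single prominent syllable, onset and tonic coincide.
    With none, (None, None) is returned.
    """
    prominent = prominent_syllables(tone_unit)
    if not prominent:
        return None, None
    return prominent[0], prominent[-1]


if __name__ == "__main__":
    for unit in ("a parcel of BOOKS lay on the TAble",
                 "a PARcel of books lay on the TAble"):
        print(unit, "->", onset_and_tonic(unit))
```

Applied to the two 'parcel of books' versions above, the sketch returns a different onset for each while the tonic syllable stays the same, which mirrors the contrast in prominence selection discussed earlier.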

Both key and termination choices are analyzed under a three term

system that divides the speaker's pitch range into three levels: high

(H), mid (M), and low (L). Clearly, for any given speaker, an indefinite

number of absolute pitch levels may be identified, and absolute pitch

level may be affected by a number of factors including individual

idiosyncrasies, emotional involvement (Bolinger, 1988) or sociocultural

convention (van Bezooyen, 1984). However, once we abstract away from

these factors, Brazil suggests we are left with a small number of pitch

contrasts used to convey purely linguistic meaning.

2 Brazil et al. (1980) suggest that it is possible to have intermediate
stressed syllables between these two if they form part of the informing
content of the tone unit; however, this pattern usually occurs in
particular styles of speech (see later discussion).

3 In cases where there is only one prominent syllable in the tone unit,
both key and termination choice fall on the same syllable.

Both key and termination pitch choices are also glossed with the

same communicative values. Choice of high pitch on the prominent

syllable denotes the constituent (or the matter of the tone unit) as

either 'contrastive' with something derivable from the preceding

discourse (including both linguistic and non-linguistic factors) or

'particularized', i.e., highlighted as crucial over and above the

surrounding information. (In the following examples, both key and

termination are realized on the same syllable):4

H                          FAILED
M //he took the exAM// and
  he did not pass, as you might have expected: contrastive
  (Sinclair & Brazil, 1982: 144)

Mid pitch choices have an additive function and denote the constituent

as an 'expansion' or 'enlargement' of the information in previous units:

M //he took the exAM// and FAILED
he did both: additive
(Sinclair & Brazil, 1982: 144)

Finally, a low pitch choice signifies an 'equative' value in relation to

previous units, giving low key an additional restrictive function. It may

be a reformulation of the previous unit, or some kind of recognition that

no new information is added:

M //he took the exAM// and
L                          FAILED
  as you would expect; from what you know of him you
  will assume that taking it involves failing it: equative
  (Sinclair & Brazil, 1982: 144)

4 Following Brazil's conventions, in all the following examples, '//'
indicates a tone unit boundary, onset syllables are given in capitals and
tonic syllables are capitalized and underlined. Key/termination levels are
indicated by H, M, & L.

Turning now to examples in which key and termination are

realized on different syllables, separate choices on these systems allow

the projection of a finer context of interaction, and a more detailed

analysis of speaker assumption and intent:

A: It's three o'clock
B: H
   M           GO//
   L //TIME to

Here the message would be, 'I take "three o'clock" as equivalent in
meaning in this context to "time to go"' (indicated by the choice of
low key), and 'I assume you will agree' (mid termination predicting
mid key 'yes, i agree'). (Brazil et al., 1980: 77)

This interactive use of the key and termination systems allows the

speaker to 'suggest' the appropriateness of certain reactions by

the hearer. In the following example, Brazil suggests the speaker

invites an "adjudicating response" from the hearer with a use of

high termination, i.e., 'consider whether he ought or ought not be

ashamed of himself', and anticipates concurrence or approval of

the proposed action with the use of mid termination on 'tell him so':

H                   SHAMED of himself//
M //he OUGHT to be a

M and i'm GOing to TELL him so//
(Brazil, 1997: 59)

In terms of previous analyses of the English intonation system, there is

nothing inherently new about identifying a small number of linguistically

contrastive pitch levels for any given speaker (Pike, 1946; Halliday,

1967; Crystal, 1969; Tench, 1996). However, Brazil's proposal differs

from these treatments in two important respects.

First, one level is not given as the 'norm', i.e., the level the

speaker will deviate from for specific (and largely attitudinal) effects.

In Brazil's model, values are derived on a relative basis. Key choice is

identified by its relative pitch height as compared to the pitch of the

key choice in the previous unit, and termination choice is identified

relative to the key choice in the same unit. As Couper-Kuhlen (1986)

notes, this allows for more precise recognition of pitch height changes

than a system that establishes a series of fixed levels. However, it

raises a different problem: How to categorize a specific pitch level

choice that may be only marginally lower or higher than a previous

choice, compared to one in which the actual F0 change is much greater.

These difficulties, as with those that come with a fixed level system,

reflect the problem of dealing with the gradient nature of the systems

measured in prosodic analysis, i.e., F0, amplitude and length. It is

suggested here, in agreement with Couper-Kuhlen (1986), that these

potential problems for analysis can be alleviated by analyzing key and

termination choices within a minimally fixed framework, i.e., the voice

range of the speaker. The first onset key is identified within this

range, and subsequent levels are identified as appreciably 'higher than'

or 'lower than' the preceding key or termination choice (see the sample

analyses shown in Chapter 3). A certain amount of flexibility must

remain within any system that attempts to describe these features, as F0

changes are conditioned by both time, which causes declination, and

position in the discourse, which results in an expansion or flattening of

the intonation contour near the beginning and ends of prosodic units

(Vaissiere, 1983; Levelt, 1989; Beckman, 1997). Second, stemming from

the level analysis, Brazil posits a form of tonal collocation, i.e. the

extent to which adjacent tones display predictable restrictions (Crystal,

1969). Changes in pitch level are constrained by movement between

adjacent levels only. Therefore, no tone unit exhibits a high key and

low termination, or low key and high termination, and there is a further

adjacent level constraint across tone unit boundaries. In formalizing

and systematically incorporating the notion of key or relative onset level

into the model, Brazil's proposals differ from those made in a number of

other current research models (Brown, Currie & Kenworthy, 1980;

Gussenhoven, 1983; Pierrehumbert & Hirschberg, 1990). However, key

choice analysis is critical in establishing pitch range interactions across

tone units both in the discourse of one speaker and in interactions

between speakers, and it has been recognized by a number of other

researchers (Couper-Kuhlen, 1986; Tench, 1996; Wennerstrom, 1997) as a

necessary construct to investigate prosodic units larger than the tone

unit or intonational phrase.5 These units, or phonological paragraphs,

are readily incorporated into Brazil's model and are discussed below.
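
The relative identification of key and termination levels described above can be illustrated with a small Python sketch. It is an illustration under stated assumptions, not the procedure used in this study: the function names are hypothetical, F0 values are taken to be in Hz, the two-semitone threshold for an "appreciable" change is an arbitrary placeholder, and an appreciable change is assumed to move the level by one step only. The adjacency check covers the constraint within a single tone unit; the constraint across tone unit boundaries is not modeled.

```python
import math

LEVELS = ["L", "M", "H"]  # low, mid, high


def semitones(f0_from, f0_to):
    """Pitch distance in semitones between two F0 values in Hz."""
    return 12 * math.log2(f0_to / f0_from)


def relative_step(prev_hz, curr_hz, threshold_st=2.0):
    """Classify curr_hz as 'higher', 'lower' or 'same' relative to prev_hz.

    threshold_st is a hypothetical cut-off for an appreciable change.
    """
    diff = semitones(prev_hz, curr_hz)
    if diff >= threshold_st:
        return "higher"
    if diff <= -threshold_st:
        return "lower"
    return "same"


def next_level(prev_level, step):
    """Move one level up or down from prev_level, staying within L to H."""
    index = LEVELS.index(prev_level)
    if step == "higher":
        index = min(index + 1, len(LEVELS) - 1)
    elif step == "lower":
        index = max(index - 1, 0)
    return LEVELS[index]


def obeys_adjacency(key, termination):
    """True if key and termination are the same or adjacent levels."""
    return abs(LEVELS.index(key) - LEVELS.index(termination)) <= 1


if __name__ == "__main__":
    step = relative_step(200.0, 250.0)
    print(step, next_level("M", step))
    print(obeys_adjacency("L", "M"), obeys_adjacency("H", "L"))
```

A small change such as 200 Hz to 205 Hz falls under the threshold and is treated as the same level, which is one way of handling the marginal cases raised above; the choice of threshold remains an analyst's decision.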

5 Wennerstrom (1997), in fact, incorporates a form of key analysis into
Pierrehumbert & Hirschberg's model for this reason.

There are two kinds of phonological paragraphing proposed in this

model: pitch concord and pitch sequences. Pitch concord describes

pitch range interactions between speakers. Brazil proposes that in

exchanges, following the consequent introduction of a new range of

pitch norms, the second speaker will aim to match her initial key choice

to the final termination choice of the first speaker in response to

whatever 'invitation' is projected by the first speaker. This is

exemplified in the following examples:

(a) A: H
       M //Do you underSTAND//

    B: H
       M //YES//

(b) A: H               STAND//
       M //Do you under

    B: H //YES//
       L
    (Brazil, 1997: 54)

In (a) the use of mid termination by speaker A is not so much a request

for a decision as an invitation to confirm that A's assumption ('i think

you do understand') is correct. Speaker B supplies this expected

concurrence with a mid key 'yes, I do'. In (b) on the other hand, the

use of high termination can be glossed as: 'Tell me, do you or do you

not understand?' and speaker B's response as asserting 'yes, there is

no question of me not understanding'. A similar example was recently

overheard on a college campus:
A: H                          WAS it//
   M //it WASN't my FAULT//

B: H //NO// of COURSE it

In this example, speaker B responds to speaker A's request to

adjudicate ('tell me, was it or wasn't it my fault') with a high key

suggesting there is no question that it was not her fault. As Brazil

notes, there is no absolute requirement that a speaker must obey the

concord rule. However, when a second speaker does wish to refuse the

invitation offered by the speaker, she may choose to do so indirectly by

realizing the expected key choice on a "dummy" item such as the mid

key choice on 'well' shown below:
A: H
   M //i COULDn't go// COULD i//

B: M //WELL// i think you
(Brazil, 1997: 56)
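
The concord relationship in these exchanges can be stated as a simple check. The sketch below is a hypothetical Python illustration rather than part of Brazil's description: the names are invented for the example, and only the high and mid termination cases glossed above are covered, so a low termination is treated as carrying no expectation here.

```python
# Gloss of the first speaker's termination choice as an invitation.
# Only the two cases discussed in the text are included (assumption).
INVITATION = {
    "H": "adjudicate",  # 'tell me, do you or do you not ...?'
    "M": "concur",      # 'confirm that my assumption is correct'
}


def expected_key(first_termination):
    """Key the second speaker is expected to choose, or None if not covered."""
    if first_termination in INVITATION:
        return first_termination
    return None


def shows_concord(first_termination, second_key):
    """True if the second speaker's key matches the first speaker's termination."""
    expected = expected_key(first_termination)
    return expected is not None and expected == second_key


if __name__ == "__main__":
    # Example (a): mid termination answered by mid key.
    print(INVITATION["M"], shows_concord("M", "M"))
    # Example (b): high termination answered by high key.
    print(INVITATION["H"], shows_concord("H", "H"))
```

Since a speaker is not obliged to obey the concord rule, a negative result marks a departure from the expected pattern and not an error in the data.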

The second construct, the pitch sequence, is a stretch of

consecutive tone units that fall between two low termination choices. It

may be uttered by one speaker or shared between two participants in

an exchange. It typically delimits longer sections of speech and may be

related, in terms of communicative value, to the next or previous pitch

sequence, or to the constituent tone units within it:

Pitch sequences resemble sentences and exchanges in
that they exhibit a kind of running down of the
constraints that unify them. By saying that low
termination is the realization of a pitch sequence
closure we are recognizing that the unit ends when
the constraints that derive from a particular kind of
language organization are reduced to zero.
(Brazil, 1985: 182)
The following example of a pitch sequence closure marks the boundaries

of a typical teacher-student exchange with a final low key on the

evaluation 'good':

T: H
   M //WHAT's the final ANSwer//

S: H
   M //sixTEEN//

T: H                       //NOW//.....
   M //sixTEEN//
   L            //GOOD//

In addition, the example shows the teacher beginning a new pitch

sequence with the high key frame 'now'.
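
The definition of a pitch sequence as a stretch of tone units closed by a low termination translates directly into a segmentation routine. The following Python sketch is illustrative only: the function name and the (text, key, termination) tuple layout are assumptions made for the example, and the levels assigned to the teacher-student exchange are read off the transcription above.

```python
def split_pitch_sequences(units):
    """Group tone units into pitch sequences.

    Each unit is a (text, key, termination) tuple with levels 'H', 'M' or 'L'.
    A sequence is closed by a unit with low termination; any trailing units
    without a closing low termination are returned as an open sequence.
    """
    sequences = []
    current = []
    for unit in units:
        current.append(unit)
        if unit[2] == "L":
            sequences.append(current)
            current = []
    if current:
        sequences.append(current)
    return sequences


if __name__ == "__main__":
    exchange = [
        ("WHAT's the final ANSwer", "M", "M"),  # teacher
        ("sixTEEN", "M", "M"),                  # student
        ("sixTEEN", "M", "M"),                  # teacher
        ("GOOD", "L", "L"),                     # teacher, closes the sequence
        ("NOW", "H", "H"),                      # teacher, opens a new sequence
    ]
    for sequence in split_pitch_sequences(exchange):
        print([text for text, _key, _termination in sequence])
```

In this sketch the exchange yields one closed sequence ending on 'good' and one open sequence beginning with the high key frame 'now'. The key of the first unit in each sequence is the value that would be compared with the preceding sequence.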

In longer narratives or monologues by one speaker, pitch

sequences create relationships with each other of 'separateness' or

'connection'. A low termination pitch sequence closure may be followed

by a high, mid or low key choice which carries the same communicative

value (contrastive, additive or equative) as key choices within tone

units; however, these are external key choices that reflect the speaker's

projection of the relationship of one pitch sequence as a whole to the

pitch sequence preceding it. A high key choice marks a point of

maximal disjunction from the previous sequence and may mark major

semantic or structural boundaries in the discourse. A mid key pitch

sequence carries a value of enlargement, expansion, or addition to the

preceding sequence, and a low key sequence closes off a prosodic unit

and may be associated with reformulations or asides which typically

have a reduced pitch range (Beckman, 1997; Tench, 1996).

A tendency for pitch concord between speakers has been noted by

a number of other researchers, particularly those working with

conversational interaction (Couper-Kuhlen & Selting, 1996). In addition,

pitch sequences most closely parallel the paratone structures that have

been discussed by researchers working with long stretches of narrative

discourse (Yule, 1980; Brown, Currie & Kenworthy 1980; Brown & Yule,

1983; Couper-Kuhlen, 1986). Major paratones are identified by a high

key onset and a low termination (or extended pause) consistent with

Brazil's high pitch sequence boundaries.6 Yule (1980) and Couper-

Kuhlen (1986) also discuss a minor paratone structure; however, only the

latter recognizes relative onset key which would make minor paratones

coextensive with Brazil's mid and low key pitch sequences. The nature

of the model also allows for new developments in prosodic paragraphing,

and this will be discussed below. In sum, phonological paragraphing is

a relatively new area of discourse analysis that can be fully

investigated using the key and termination options proposed in this model.


The third and final system posited in the model is that of tone.

This is concerned with pitch movement rather than pitch level and

appears in addition to the termination choice on the tonic syllable. Tone

denotes the status of the content of the tone unit, i.e., whether it is

'new' or 'given' within the context of the interaction. Brazil recognizes

five tonal contours:

falling: fall (p); rise-fall (p+)

rising: fall-rise (r); rise (r+)

level: neutral tone (o)

6 In fact, these are more likely to be coextensive with the sequence
chain boundaries proposed by Barr (1990) and discussed below.

Excluding for the moment the neutral tone choice, the four possibilities

can be divided into two opposing pairs: rising and falling. Tones that

end in a falling movement are termed proclaiming tones. The use of

these tones signifies the content is new, i.e., not recoverable from the

preceding discourse, or is asserted, i.e., as necessary or

incontrovertible truth or fact. Tones with a rising movement are termed

referring tones and signify that this information is already

"conversationally in play" i.e., assumed to be known or recoverable from

the preceding discourse or non-linguistic context. In the following

examples, a teacher is providing examples of commonplace 'rubbing

movements' in order to demonstrate the concept of friction to her students:


(1) //p when you strike a match//
//r it's a rubbing movement//

(2) //r when we rub our hands together//
//p we are causing friction//
(Brazil et al., 1980: 14)

(1) can be glossed as 'talking of rubbing movements, another (new) kind

is striking a match'. (2) reverses the organization of 'new' and 'given'

and can be glossed as 'all these examples of rubbing movements (such

as rubbing our hands together) are causing something new I will

introduce to you called friction'. Thus, tone choice summarizes the

'common ground' between speakers at any particular moment in a given interaction.


As with choices of key and termination, the speaker operates on

the basis of her assessment of the state of convergence between herself,

the hearer and the message. This assumption of common ground can be

seen most clearly in cases where the hearer(s) cannot confirm the

correctness of the assumption directly, yet some state of convergence is

projected. In the following example, a news announcer in Britain

assumes that the name of the prime minister of Britain will be known to

the audience (hence the 'r' tone) whereas the name of her French

counterpart may not:

//p the prime MINister//r mrs THATcher//
//p the prime MINister//p raymond BARRE//
(Brazil et al., 1980: 18)

However, it is also important to remember that choices are under the

speaker's executive control (Levelt, 1989). In other words, speaker

intention can override any 'expected' choices that may be anticipated

based on context. For example, the system allows the speaker the

option to project a state of convergence that has not existed until that

moment, i.e. choose tones as if something had already been negotiated.

The tonal system is also used to reflect sociolinguistic variables

such as differences in social status between speaker and hearer, or

"social distance", i.e., whether interlocutors are intimates or strangers

(Wolfson, 1988). For example, the '+' tones (r+/p+) carry the same

information value as their r/p counterparts; however, Brazil suggests

they carry an added value of dominance. Choice between the regular

and '+' version of these tones is often based on the status relationship

between participants of the discourse where the '+' tones, as dominant

tones, are the prerogative of the controller of the discourse or the

participant who claims control.
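
The five tones and the values attached to them can be summarized in a lookup table and applied to transcriptions written in the '//p ...//r ...//' notation used in this chapter. This is a minimal Python sketch under stated assumptions: the names are hypothetical, each tone unit is assumed to begin with its tone label followed by a space, and the dominance column reflects only the general gloss given above, not the context-dependent readings of the '+' tones.

```python
import re

# tone label -> (pitch movement, information status, dominant?)
TONES = {
    "p": ("fall", "proclaiming", False),
    "p+": ("rise-fall", "proclaiming", True),
    "r": ("fall-rise", "referring", False),
    "r+": ("rise", "referring", True),
    "o": ("level", "neutral", False),
}

# A tone unit is assumed to start with its tone label and a space.
TONE_UNIT = re.compile(r"^(p\+|r\+|p|r|o)\s+(.+)$")


def parse_tone_units(transcript):
    """Split a '//'-delimited transcript into (tone, text) pairs."""
    units = []
    for chunk in transcript.split("//"):
        match = TONE_UNIT.match(chunk.strip())
        if match:
            units.append((match.group(1), match.group(2)))
    return units


def uses_only_dominant_tones(transcript):
    """True if the transcript has tone units and all carry a '+' tone."""
    units = parse_tone_units(transcript)
    return bool(units) and all(TONES[tone][2] for tone, _text in units)


if __name__ == "__main__":
    for tone, text in parse_tone_units("//p the prime MINister//r mrs THATcher//"):
        movement, status, dominant = TONES[tone]
        print(text, "|", movement, "|", status, "| dominant:", dominant)
```

A check of this kind only records which tones were chosen. Whether a dominant tone is appropriate still depends on the status relationship between the participants in the particular interaction.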

In cases where the status of participants is unequal, e.g. teacher-

student, doctor-patient interactions, division of tone choices along

dominant/non-dominant lines is more easily identifiable. The example

below was heard in a college classroom where the teacher was a rather

timid Chinese ITA with limited language proficiency and potentially

ambiguous dominant status for the American listener. At one point,

after several repetitive checks by the ITA on student comprehension,

the following exchange occurred:

T: Does everyone understand? Are there any questions?
S: //p+ NO//p+ just go ON//p+ PLEASE//

Despite the ostensibly polite form, judging by the reactions of the

observer and other students in the class, this response was clearly

perceived as disrespectful and as 'overstepping' the teacher-student

boundary. This was seen as an example of the inappropriate use of a

dominant proclaiming tone by the lower status participant in the context

of this interaction:

The assumption of dominance in circumstances where there is an
ongoing expectation that the speaker in question will accept a
non-dominant role can sometimes amount to rudeness.
(Brazil, 1997: 86)
These interpretations are very dependent on the sociolinguistic

context of the specific interaction in which they are used. In

interactions between intimates or status equals, for example, use of '+'

tones may represent not so much a dominant function, as a function of

intervention or reminding, in that the speaker takes a positive initiative

in invoking common ground (r+) or changing the world of the hearer

(p+). Dominant tones may also be used in interactions between

strangers when a certain situation briefly confers dominant status on

one of the participants; for example, when a pedestrian is giving

directions to a passing motorist.

7 This tone may also be realized with a slight low rise.

The final possible tone, the 'o' or level tone,7 is unique in that it

places the constituent outside the context of the interaction, i.e., it is

neutral in terms of its communicative value, and the speaker is

essentially marking it off from the surrounding informative content.

Halliday (1967) does not recognize a neutral tone as part of his main

tonal inventory, as he suggests it is rarely used in normal, everyday

conversation. This, in fact, supports Brazil's contention that use of

the tone places information outside of the interactional context,

something that presumably most interactants would not want to do

unless for very specific reasons. Halliday (1967), Crystal (1969) and

Brazil all suggest that the level tone is used for semi-ritualized or

routinized language behavior such as choral prayer or giving directives

in the classroom: "//o stop WRITing//o PUT your PENS down//" (Brazil,

1997: 138). Brazil also identifies another very specific use of level tone

in the classroom in a routine formula used by teachers and recognized

by students called the template technique in which the teacher invites

the students to complete a sentence with the correct information:

T: //o and then I...//
S: Natural log of both sides

Despite what would seem to be isolated occurrences of this tone, it

plays a significant role in Brazil's model in distinguishing two types of

discourse that have important implications for successful interaction

between participants. The reader will recall that the participants in any

given interaction are involved in a process of reaching a mutual

understanding of the status of the information being given and

received, and that the tonal system is an essential part of this process

as it indicates whether the speaker is projecting the matter as shared

common ground or new information. This use of the intonation system

for the benefit of the hearer's comprehension is termed direct discourse,

as the speaker is directly orienting intonational choices toward a state

of convergence. Brazil suggests, however, that the system also allows

for a speaker to select choices that are not oriented toward the listener

and do not place a given utterance/utterances in a relationship with

other parts of the discourse message. In effect, the speaker

temporarily withdraws from the context of the interaction, and the

communicative values inherent in the system are temporarily suspended.

In this case, choices in the system create oblique discourse, i.e., an

orientation inward toward the language specimen rather than outward

toward the hearer.

The principal characteristics of an oblique orientation are the use

of a level 'o' tone in combination with a proclaiming tone, and multiple

prominences within a single tone unit. For example, in normal

conversation a speaker may decide to include a familiar quotation ('you

can TAKE a HORSE to WAter but you CAN'T MAKE him DRINK'). An

utterance presented in this manner can be glossed as "these are not my

words addressed particularly to you on this occasion; they are rather a

routine performance whose appropriateness to our present situation we

both recognize" (1997: 136). A second condition under which choices

indicating oblique orientation can occur is in places where the speaker

has momentary problems with linguistic coding which temporarily cause

an orientation change. Unplanned or partially planned discourse is

often filled with pause fillers and other kinds of hesitation markers

which will frequently be uttered in a level tone as the speaker's focus

shifts briefly to the language sample. This description of oblique

orientation subsumes various uses of the level tone that have been

described by other researchers in situations such as choral prayer or

other discourse events where participants recite formulaic responses or

in the hesitation phenomena commonly found in spontaneous speech. It

also applies to an activity Brazil terms 'reading out', where decisions in

the intonation systems are made on the basis of the linguistic

organization of the text rather than concern with how any given

utterance meshes with the context of the interaction.

In all the situations mentioned above, it is also possible to adopt a

directly-oriented approach. This, for example, is the difference between

'reading out' and 'reading aloud'; again, the ability to make either

choice highlights the fact that the decision lies to a large extent with

the speaker. There is no situation where a speaker must make a

particular choice; rather the system operates on the Gricean co-

operative principle that, generally speaking, speakers' contributions are

designed to be understood (Grice, 1975). As with any system in

language, this creates an area of conventionalized choices, and prosodic

composition is one way in which we identify different language events

(Tench, 1996). Classroom discourse, for example, is likely to be

characterized by certain intonation patterns such as clearly structured

direct orientation choices for informative content, pitch concord in

teacher-student exchanges, and the use of level tones for formulaic

instructions (Brazil et al., 1980; Sinclair & Brazil, 1982).

Summary of Brazil's Model

In summary, the three interlocking systems of key, termination

and tone form the basic components of an intonation system in English

that has independent implications for the communicative value of the

discourse. With the inclusion of key choice, the model provides a

principled framework for the description and interpretation of

intonational structure in discourse as well as in individual utterances,

allowing investigation of structures larger than the tone unit or

intonational phrase. Choices in the three systems of the model and

structured phonological paragraphing show that spoken discourse,

whether overtly dialogic or not, is organized for the benefit of the

hearer and toward a mutual understanding by participants of the

discourse message. Certain combinations of systems also demonstrate

that intonation choice is under speaker control, and that the speaker

may exploit the system to alter the communicative value of the utterance

or, alternatively, temporarily withdraw from the interaction under

certain conditions. For these reasons, the model provides a systematic

framework to analyze prosodic structure which can then be compared to

other levels of linguistic description.

Comparison with Two Models of Intonation in English

This section will compare Brazil's model with aspects of two other

models of intonation in English (Halliday, 1967; Pierrehumbert &

Hirschberg, 1990). These comparisons are limited and selective.

However, they show in general terms why Brazil's model is considered to

be the most economic and insightful for the data analysis undertaken in

this study.

Halliday (1967)

Halliday also proposes that intonation structure consists of three

separate systems: tonality (tone unit division), tonicity (internal

structure of tone units) and tone (pitch movement on the final tonic).

Taking first the two systems of tonality and tonicity, Halliday suggests

a marked/unmarked distinction in which unmarked tone units are

coextensive with information units and syntactic clauses. For natural

data, this can be problematic. First, as Couper-Kuhlen & Selting (1996)

note, there may be no recognizable prosodic boundary between two

nuclear or tonic syllables yet only one may appear in an unmarked unit.

Therefore, boundaries are drawn on syntactic grounds even when they

are not supported by any phonological criteria. An example is shown below:


the prince of WALES// is visiting CARdiff//
(Couper-Kuhlen & Selting, 1996: 15)

In this case, a tone unit boundary is drawn between the two tonics;

however, 'is visiting' could be analyzed as either a proclitic or enclitic

element without materially affecting the meaning inherent in the

intonation structure, and without recourse to syntax.

Secondly, natural data is replete with identifiable pause defined or

pitch defined prosodic units that are not coextensive with traditional

syntactic units including hesitation markers, false starts and truncated

sentential structures (Crystal, 1969; Brown, Currie & Kenworthy, 1980;

Couper-Kuhlen & Selting, 1996). In Halliday's system these would be

considered marked structures yet they are a common feature of

spontaneous and partially planned spoken discourse. Similar difficulties

apply to the concept of tonicity. Halliday suggests that the internal

structure of an unmarked tone unit consists of "given" information

followed by a "new" or focal element coinciding with the tonic syllable

on the last lexical item. Once again, as Brown, Currie and Kenworthy

(1980) show, in many cases in natural data, a new item may appear at the

beginning of the unit followed by a given structure:

THAT'S what I regret....
(Brown, Currie & Kenworthy, p.156)

In this case, the tonic falls on the first item. As the authors suggest,

it is only because two separate systems (given versus new, and

identification of prominent syllables by phonological criteria) are merged,

that data is forced into a marked category. In Brazil's system, 'new'

information is not connected to particular syntagmatic slots in the tone

unit. In the example above, choice of prominence in and of itself

reflects the speaker's intention to project this as informative content,

and tone choice will indicate whether the speaker believes this to be

new information for the hearer.

The third system in Halliday's model is tone. Five possible primary

tones, differentiated by pitch movement, may appear on the tonic

syllable. This system is closely tied to the syntactic structure of the

discourse and also employs the marked/unmarked division:

Distinctions expressed by the choice of different
tones...belong in the realm of grammar (and within
grammar, the realm of syntax). (Halliday, 1970: 21)

The following example (taken from Brazil, Coulthard & Johns, 1980:

107) exemplifies the unmarked distinction for WH- questions (falling) and

its marked counterpart. The example is followed by the equivalent tone

choices in Brazil's system:

(d) WH- question: tone 1, neutral; tone 2, mild
//1 what's the time//
//2 what's the time// ('may I ask please')
(Halliday, 1970: 27)

//p what's the TIME//
//r+ what's the TIME//

In Brazil's system, the difference between these two tonal values

would be in the assumed 'state of convergence', i.e., questions in

referring tone may be heard as in some way anticipating the answer, or

as a request to be reminded rather than told, and overtones of

'tentative' or 'deferential' are dependent on the specific context of the

interaction. Seen in this light, the supposed neutral tone for WH-

questions may be less appropriate in one particular context than it is in

another. This point is also taken up by other researchers (Crystal,

1969) who argue that it is not productive to assign tones to certain

structures, particularly different kinds of questions, as the data does

not support this kind of dichotomy:

Analysis of most varieties of English speech shows
that the issue is hardly as simple as this, it being
quite possible to have both a falling and rising tone
with each kind of question. (p.3)

These difficulties with all three systems (tonality, tonicity and tone) suggest

that intonation should be viewed as an independent level of meaning,

not as a device defined by grammatical choices.

In addition to the five primary tones, Halliday also proposes a

system of secondary tones which appear on both the tonic and the

pretonic (equivalent to Brazil's onset syllable). This system, which

Halliday calls 'key', also includes three pitch levels (as well as tonal

movements) but differs from Brazil's key system in that one level is

recognized as the 'norm' or neutral tone, and most importantly, that its

sole function is to indicate affective meaning.

Pretonic secondary tones extend from the onset to the tonic

syllable and are attached to primary tones. For example, the pretonic

on tone 1 can be a neutral, even tone, or a 'bouncing' tone, that

Halliday glosses as 'forceful or querulous':

//1 why don't you make up your mind// (unemotional)

//1 why don't you make up your mind// (for heaven's sake)
(Halliday, 1970: 32)
With Halliday's recognition of the internal foot structure of the unit,

each 'salient' syllable would bear the 'bouncing' movement. This creates

three prominent syllables:

//p WHY don't you MAKE up your MIND //

This multiple prominence pattern alone would separate the unit from the

surrounding discourse and suggest a stronger focus on the message

itself, rather like giving an instruction. In fact, the same effect can be

achieved using falling contours on the pretonic segment, and the

substitution of a p+ dominant tone implies even more 'forcefulness':

//p+ WHY don't you MAKE up your MIND//

However, within Halliday's system the equivalent of the p+ tone, tone 5,

attaches to its own secondary pretonic tone and is glossed as

'awestruck or disappointed':

//5 LOOK at that MARvelous old STEAM engine//
(p. 33)
These examples demonstrate that great care needs to be taken in

separating intonational effects from the effects of the lexical items

themselves. In the following example of Halliday's tone 1 with a

'bouncing' pretonic, it is difficult to assign a 'forceful or querulous' interpretation:


//p JOHN's deCIDed to beCOME a DOCtor//

Again, the interpretation seems to be more like some kind of concern

with the way the message is being said as though it were being quoted

or somehow distanced from the speaker largely due to the effect of the

multiple prominences. Intonation clearly has an affective component

(Bolinger, 1988); however, there is a danger in applying too many

precise labels and unnecessarily complicating the tonal inventory. This

is particularly true of affective meaning, as there are many other

prosodic and paralinguistic variables that are invariably involved, such

as loudness, extra-heavy stress, rate, tension, choice of lexis and

kinesics (Crystal, 1969; Tench, 1996). In addition, there is the issue of

separating universal indicators of some kind of emotional effect from

language specific conventions (Bolinger, 1988; Vaissiere, 1983). Certain

prosodic features such as a change in volume or an increase in tempo

may be universally recognizable, whereas other more subtle effects may

be more language specific. In the discussion of this example given

earlier: //p+ NO//Just go N// PLEASE//, it was suggested that the

effect of rudeness was at least partially conveyed by the use of

dominant p+ tones by an unequal participant. In the same context, in a

language other than English, the attitudinal effect conveyed by this

contour may be very different. At the very least, discussion of

intonational correlations with affective meaning shows that examples

should be analyzed as they occur in individual speech communities, and

in authentic contexts of interaction.

In summary, this brief examination of Halliday's model suggests

that intonation choices should be interpreted independently and

uncoupled from grammatical categories and attitudinal labels in order to

investigate their contribution to discourse.

Pierrehumbert & Hirschberg (1990)

This second, more recent model has also been used in the

comparative analysis of NS and NNS discourse (Wennerstrom, 1997, 1998).

In agreement with Brazil, Pierrehumbert & Hirschberg propose an

independent system, based only on phonological form, which assigns a

primarily pragmatic function to intonation choices:

We propose that a speaker chooses a particular tune
to convey a particular relationship between an
utterance, currently perceived beliefs of a hearer or
hearers and anticipated contribution of subsequent utterances.
(1990: 271)

Unlike the tonal contour analyses discussed above, the model comprises

a series of static tones or tonal targets that, together with a series of

phonetic implementation rules, determine the shape of the Fo contour.

There are two groups of tones: pitch accents and boundary tones.

There are six pitch accents (H*, L*, H* + L, H + L*, L* + H, L + H*)

which occur on stressed or 'salient' syllables and mark the information

status of the item. For example, high pitch accents mark the 'new'

information in the following example:

The train leaves at seven
H* H* H* (p. 286)

The second group of tones are those that associate with the right

edge, or closing boundary of either intermediate phrases, or intonational

phrases (L%, H%). Phrases are identified by phonetic criteria and

pausing. As the end of an intonational phrase is also the end of an

intermediate phrase, this creates four possible 'complex' tones at the

end of an utterance. The following example exemplifies a typical

declarative contour:

The train leaves at seven
H* H* H* L L% (p. 286)
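
The count of four complex tones follows from pairing each phrase accent with each boundary tone. A minimal Python sketch of that enumeration is given here; the variable names are illustrative assumptions and are not part of Pierrehumbert & Hirschberg's notation.

```python
from itertools import product

# Phrase accents close an intermediate phrase; boundary tones close an
# intonational phrase. The constant names are assumptions for illustration.
PHRASE_ACCENTS = ("L", "H")
BOUNDARY_TONES = ("L%", "H%")

# Each intonational phrase ends in one phrase accent plus one boundary tone.
complex_tones = [
    f"{accent} {boundary}"
    for accent, boundary in product(PHRASE_ACCENTS, BOUNDARY_TONES)
]

print(complex_tones)
```

The declarative contour in the example above ends in the first of these combinations.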

Final boundary tones also indicate whether a section of the discourse is

complete (LL%), or if further discourse is required for its interpretation

(HH%). Finally, a number of automatic phonetic implementation rules

also apply. Two of the most significant are an upstep rule which raises

a L% boundary tone after a H phrase accent, and a catathesis rule

which causes a gradual declination of pitch across a phrase.

Many of the tonal combinations that are identified by

Pierrehumbert & Hirschberg and the values attached to them bear a

great deal of similarity to Brazil's interpretations. For example, the

following contour, an H* pitch accent followed by an L phrase accent

and a L% boundary tone, is said to "convey new information" in much

the same way that Brazil's proclaiming tone adds a new variable to the


Legumes are a good source of vitamins
H* L L%
(p. 272)
If the L phrase accent is followed by a H% boundary tone, the contour

becomes equivalent to Brazil's mid termination referring tone which is

synonymous with Pierrehumbert & Hirschberg's gloss of "when S

believes that H is already aware of the information, if S wishes to

convey that it is mutually believed" (p. 290). The next example was

spoken by a young woman who was asked after a movie if she liked it

and is made up of both a H phrase accent and H boundary tone:

I thought it was good
H* H* H H% (p. 290)

This is glossed as 'I thought it was good, but do you agree with me?'

and corresponds to Brazil's interpretation of the adjudicating value of

high key ('I would like a yes/no response'). In a final example, the

authors suggest the L+H* LH% marks background information:

A: What about the beans? Who ate them?
B: Fred ate the beans
H* L L + H* L H% (p. 296)

The gloss here is 'as for the beans, Fred ate them', and this fall-rise

pattern corresponds to Brazil's referring tone for information already

established in the discourse.

Final boundary tones also play a less defined, but similar role to

Brazil's termination choices. For example, Pierrehumbert & Hirschberg state:


An H boundary tone indicates S wishes H to interpret
an utterance with particular attention to subsequent
utterances. An L boundary tone does not convey such
directionality. (p. 305)

An example of this is given below:

a. Attach the jumper cables to the car that's running
L H%
b. Attach them to the car you want to start
L H%
c. Try the ignition
L H%
d. If you're lucky
L H%
e. you've started your car
L L% (p. 306-7)
With the operation of the phonetic implementation rules, phrases (a) -

(d) end with a mid termination, and (e) ends with a low termination

corresponding to Brazil's pitch sequence closure.

At this level of comparison, there is clearly a strong resemblance

between the two models in their mutual conception of the function of

intonation in discourse and some similarities in how these are realized

by Fo values. Both also claim that only salient or prominent syllables

make up the meaning-bearing elements of the contour, although P & H

also account for the phonetic variations between these syllables. Finally,

both also recognize that intonation structuring extends beyond

individual tone units and that this is signalled by the final choice(s)

made in the unit.

However, there are also some notable differences in the

interpretation of phonological constructs and in the recognition of

boundary tones, two of which are discussed below. In the P & H model,

pitch accents apply to individual salient items, and an unlimited number

of syllables can be stressed in any given intonational phrase;

consequently, there is no discussion of the possible effect of multiple

prominences. Returning to an earlier example, 'the train leaves at

seven', the high pitch accents would be analyzed under Brazil's system

as //the TRAIN LEAVES at SEven//, an utterance only likely to occur in

a situation where someone is being particularly insistent: 'you know the

TRAIN LEAVES at SEven (and you're going to miss it unless you hurry

up)' and interpreted as a change in orientation as the speaker

'pronounces' the information. This level of interpretation is not

discussed by P & H, as they are largely concerned with describing the

status of individual items rather than the effect of prominence choices

on the unit as a whole.

Finally, there is no discussion of phrase initial, left edge

boundary tones. The boundary tones proposed by Pierrehumbert &

Hirschberg only apply to the end of utterances, and there is no

suggestion of the possibility of equivalent initial boundary tones.

Consequently, there is no discussion of pitch concord, or of the

possibility of larger phonological paragraphs marked by both initial

and final pitch values. While Pierrehumbert & Hirschberg do suggest

that a low boundary indicates some kind of closure, they do not

examine this issue any further. In summary, the interpretive model

provided by P & H up to this point offers less insight into the larger

prosodic units currently being investigated in discourse.


This limited comparative discussion of two models of the intonation

system in English emphasizes the importance of recognizing prosody as

an independent structuring device interacting with, but not necessarily

defined by, other language systems in the discourse. In addition, it

highlights the importance of an interpretive system that can offer

insight into the larger prosodic units currently being investigated in

discourse. For these reasons, I conclude that Brazil's model provides

the most comprehensive and explanatory framework for the analysis

conducted in this study.

Additions to Brazil's Model

In this final section, I discuss two additions to Brazil's original

model that are used in the analysis presented in the following chapters.

Both are included as they apply specifically to the data used in this

study. They offer additional insight into the structuring of classroom

discourse by native speakers and incorporate previous findings

regarding the prosodic features of typical nonnative speaker teaching discourse.


Sequence Chains (Barr, 1990)

In recent work that applies Brazil's model to the analysis of

native speaker lecture discourse, Barr (1990) identifies a unit of

intonation structure termed a sequence chain. Sequence chains

formalize a group of pitch sequences. The opening boundary is indicated

by the use of high key or one of the lecturing frames typically found

in teaching discourse, i.e. OK, NOW, SO; and the sequence chain closes

with a low termination:

[The sequence chain] is above the pitch sequence and
is defined as a string of pitch sequences such that
the first pitch sequence and only the first sequence
begins with a high key...Thus minimally, a sequence
chain consists of a single high key-initial pitch
sequence, but maximally, a sequence chain consists of
indeterminate numbers of pitch sequences such that
any non-initial pitch sequence is either mid/low key
and therefore additive or equative to the previous
one. (p. 11)
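
The key-based part of this definition can be sketched as a simple grouping procedure. The Python fragment below is a minimal illustration only: the function name and the string labels "high", "mid" and "low" are assumptions, lecturing frames and low termination are not modelled, and a recording that does not open on a high key is assumed to start a chain anyway.

```python
def group_into_chains(initial_keys):
    """Group pitch sequences into sequence chains.

    initial_keys lists the key ("high", "mid" or "low") on which each
    pitch sequence begins. A high key opens a new chain; a mid or low
    key continues the current chain.
    """
    chains = []
    for index, key in enumerate(initial_keys):
        if key not in ("high", "mid", "low"):
            raise ValueError(f"unknown key: {key!r}")
        if key == "high" or not chains:
            chains.append([index])      # open a new sequence chain
        else:
            chains[-1].append(index)    # additive/equative continuation
    return chains


# Hypothetical key choices for five consecutive pitch sequences.
keys = ["high", "mid", "low", "high", "mid"]
print(group_into_chains(keys))
```

Each inner list holds the indices of the pitch sequences that belong to one chain, so a chain of length one corresponds to the minimal case in the definition.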

Barr suggests that sequence chain boundaries will begin with an

introductory topic expression and are coextensive with the level of

lecture organization found in the layout of prepared visuals such as

overhead projections, the blackboard or handouts. Sequence chain

boundaries may also parallel changes in discourse plane (Sinclair &

Brazil, 1982); that is, shifts in the area of attention of the discourse

such as a movement from talk about the content of the class to talk

about the organization of the class. In the example below, a sequence

chain consisting of three pitch sequences marks a typical plane change:



(7) ////p and i've MENTioned//o

what their MAIN decision

///it's this inVESTment//p in SHARES//o

whether to BUY//o whether to SELL//p

///o and i WANT to take that//r ARGument
//p a STAGE

The boundary between this and the following sequence
chain marks a change back to the content of the

(8) ////p the deCISion to inVEST//

which is the beginning of another content chunk.
(Barr, 1990: 15)

As this study also investigates teaching discourse, I have decided

to include an analysis of potential sequence chain structure. However,

because of the particular style of classroom discourse examined here,

there was a difficulty in applying Barr's criteria of co-occurring visual

cues as support for sequence chain interaction with other levels of

discourse organization. Barr's data consists of concept-based lectures

where professors used a variety of prepared visual aids. The data in

the present study are typical of the style of short prelab presentations

given in introductory science laboratory classes (Jacobson, 1986). They

8 Pitch sequence boundaries are marked by '///' and sequence chain
boundaries by '////'.

are much less formal in nature, and there are no accompanying prepared

materials. TAs used only the blackboard as a visual aid, usually in a

less systematic manner than might be found in longer, more formal

lectures. For these reasons, it was not possible to draw the same

parallels between visual aids and SC structuring in the discourse.

In place of Barr's original criteria, I have drawn on the

transaction structure proposed by Coulthard & Montgomery (1981) and

Shaw (1994) to investigate co-occurring cues at other levels of discourse

organization. A transaction is a "chunk" of discourse containing a

unifying topic and defined by prospective and retrospective markers at

its boundaries. In an analysis of NS teaching discourse in university

engineering and business and management classes, Shaw suggests that

typical focussing markers include both verbal and non-verbal cues, for

example, lexical phrases such as 'for the first part', micro-markers such

as 'ok', and topic-length pauses while the professor scans her notes or

scans the audience. Shaw's analysis of the phonological structuring of

transaction boundaries, however, is limited to a brief discussion of the

use of rising or falling intonation on micro-markers. In this analysis I

propose to unite both Barr's and Shaw's findings and investigate the

co-occurrence of phonological cues indicating sequence chain structure with

transaction boundary cues. Places where these are coextensive are seen

to be evidence of the speaker's intention to organize the discourse for

the benefit of the hearer by providing a series of cues at different

levels of discourse organization. Where Barr's original criteria do apply,

their relationship to co-occurring transaction boundary cues as well as

sequence chain structure is discussed.
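
One way to picture this co-occurrence check is as a comparison of two lists of time points. The sketch below is an illustration under stated assumptions and not the procedure used in the analysis: the function name, the 0.5 second tolerance and the sample times are all hypothetical.

```python
def co_occurring(chain_boundaries, transaction_cues, tolerance=0.5):
    """Return the sequence chain boundary times (in seconds) that have at
    least one transaction boundary cue within `tolerance` seconds.

    The tolerance value is an assumption chosen for illustration.
    """
    return [
        boundary
        for boundary in chain_boundaries
        if any(abs(boundary - cue) <= tolerance for cue in transaction_cues)
    ]


# Hypothetical boundary and cue times, in seconds from the start of a recording.
chain_boundaries = [0.0, 41.2, 95.7]
transaction_cues = [0.3, 60.0, 95.5]

matches = co_occurring(chain_boundaries, transaction_cues)
print(matches)
```

Boundaries returned by the function are those where phonological and transaction-level cues coincide; the remaining boundaries are marked at one level only.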

In the example given above for instance, the high key lexical

phrase 'I've already mentioned' is a prospective focussing marker. The

high key on the lexical phrase unites cues at different levels of

discourse to indicate a structural boundary related to a change in

discourse plane. The same applies to the low termination on the

retrospective marker 'I want to take that argument now a stage

further.' Additional visual cues coinciding with the second sequence

chain would be a possible further addition to the constellation of cues

highlighting this boundary.

In summary, Barr's sequence chain structure formalizes a final

level of intonation structure that can be incorporated into Brazil's

original framework and provides further evidence of intentional use of

the intonation system by the speaker to organize the discourse.

Pause Analysis

The second addition is pause analysis, and is included in order to

complete a comparison of the prosodic structure of NS and NNS data. In

production and perception studies of pause boundaries in Dutch and

English, Swerts & Geluykens (1994) and Swerts (1997) found that pauses

are longer for major than minor topic shifts and that longer pauses

increase perception of boundary strength. Vaissiere (1983) suggests a

universal tendency for pause defined units in spoken discourse, with

pauses between sentences being longer than pauses within sentences.

Analyses of nonnative speaker data show a qualitative difference in both

placement and length of pauses which can materially affect the overall

prosodic structure of the discourse. In a pilot study of two parallel

lecture extracts, one given by an NS TA and the other by a Chinese

ITA, I found that pauses in the NNS data were both longer and more

erratic than those in the NS data and tended to regularly break up

conceptual units.

In agreement with Rounds (1987), my data were also characterized

by empty pauses, regular moments of silence unrelated to boardwork or

dramatic effect, which Rounds suggests artificially increase the

amount of silence in the discourse, creating a negative perception of the

ITA. In light of these differences in pause structure between NS and

NNS TAs, and their potential to disrupt the overall prosodic structure of

the discourse, as well as the use of pauses to cue transaction

boundaries, I decided a principled discussion should be included in the

present analysis. Brazil does not elaborate on pause patterns apart

from noting that they may and frequently do coincide with tone unit

boundaries; however, one group of researchers (Brown, 1977; Brown,

Currie & Kenworthy, 1980; Brown & Yule, 1983) has developed a model

identifying pause defined units in discourse. They identify three major

groups: pauses of 0.8 seconds or longer constitute topic boundaries and

"clearly coincide with major semantic breaks" (p. 56). These are called

topic pauses. The second group vary between 0.6 and 0.8 seconds and

are referred to as substantial pauses which tend to coincide with single

contours. The third and final set, very short pauses, vary between 0.2

seconds and 0.4 seconds and are identified as a 'sub-set of the contour

pauses'. This final group frequently co-occurs with incomplete syntactic

structures. This model will be taken as a first approximation for pause

analysis in the data.9
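
The three pause groups can be expressed as a small classification function. This is a minimal sketch, not part of the cited model: the function name and labels are illustrative, a pause of exactly 0.8 seconds is assigned to the topic group, and durations outside the three reported ranges (for example, between 0.4 and 0.6 seconds) are returned as "unclassified", which is an assumption made here.

```python
def classify_pause(duration):
    """Label a pause duration (in seconds) using the three reported ranges."""
    if duration >= 0.8:
        return "topic pause"
    if 0.6 <= duration < 0.8:
        return "substantial pause"
    if 0.2 <= duration <= 0.4:
        return "very short pause"
    # Durations not covered by the reported ranges (assumption).
    return "unclassified"


# Hypothetical pause durations in seconds.
for seconds in (1.2, 0.7, 0.3, 0.5):
    print(seconds, classify_pause(seconds))
```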


The model outlined in this chapter allows the analysis and

interpretation of pitch movements over time within a principled

framework uniting both the form and function of intonation in English.

It proposes a hierarchical system of prosodic units which together

provide an independent layer of structure to the discourse and

contribute to the pragmatic message contained within the discourse as a

whole. The identification and interpretation of phonological paragraphs,

and their interaction with other levels of discourse organization is a

relatively recent undertaking; however, it is clearly a potentially

powerful organizational tool used by speakers in their production and

interpretation of discourse. Lastly, in providing a comprehensive

framework with which to describe the intonation structure of native

speaker discourse, the model offers a way to undertake a systematic

comparative analysis of prosodics in nonnative speaker discourse.

Figure 2-1 summarizes the systems proposed in Brazil's model.

9 These researchers are working with spontaneous speech, not lecture
discourse, and different genres can affect prosodic patterns such as
pause structure (Crystal & Davy, 1969).



Sequence chain: Pitch defined conceptual units bounded by a high key or
lecturing frame and low termination

Pitch sequence: Pitch defined conceptually related units bounded by two
low tonic syllables

Tone unit: Pitch defined units, may coincide with syntactic/pause
boundaries. Each unit contains 1 or 2 prominent syllables


Key choice

Low key: Equative
Mid key: Additive
High key: Contrastive

Termination & tone choice

O tone: neutral
P/P+ tone: new content
R/R+ tone: recoverable

Figure 2-1. A Model of Discourse Intonation (Brazil, 1997; Barr, 1990)



This chapter describes the procedures used in the collection and

analysis of the data investigated in this study. More specific

information regarding individual teaching presentations will be given at

the beginning of each section of the analysis.

The study is based on 56 minutes of data from teaching

presentations given by 16 male teaching assistants teaching introductory

labs in chemistry, physics, and electrical engineering and a pre-calculus

math discussion section. The TAs represent three language groups:

native speakers (NS), non-native speakers (NNS) and speakers of an

indigenized variety of English (IVE) (Sridhar & Sridhar, 1992), namely

Indian English (IES), as shown in the table below.

Table 3-1. Teaching Assistants

The groups of nonnative and Indian international teaching

assistants (ITA) were chosen based on their score on the ETS SPEAK

test and their language backgrounds. All ITAs received 45-50 on the

SPEAK exam, which categorizes the speakers' communication skills as

"somewhat to generally effective" in terms of the ETS guidelines. ITAs

with overwhelming problems in one area of linguistic skill such as

segmental pronunciation were not included. All the ITAs were the sole

instructor responsible for their lab or discussion section and were

recorded in their first semester of teaching. The six nonnative TAs

were from mainland China, and their first language was Mandarin

Chinese. The four Indian TAs were from both North and South India

and their first languages were Tamil (1), Urdu (1), and Bengali (2).

These TAs also spoke Hindi and a number of local Indian languages to

varying degrees of competence and were educated in English medium

schools from an early age. Native speaker TAs were contacted through

the supervisors of the courses in question. All the TAs in this native

speaker group were described as "relatively experienced" but none were

specifically described as "model" TAs.

Where possible, the 16 TAs were recorded on the same day or in

the same week in order to compare parallel teaching presentations. This

was possible in all but two cases, and parallel discourse extracts are

shown in the table below. The two presentations with no parallels are

marked with an asterisk. The data represent a cross-section of typical

functions performed by TAs in these prelab presentations, including

giving theoretical background to the experiment, reviewing homework,

explaining relevant terms or equations and demonstrating experimental

procedures (Jacobson, 1986; Axelson & Madden, 1994).

Table 3-2. Teaching Presentations


Discipline               Topic                               Presentations   TAs
Chemistry                Unknown Analysis                    3               1 NS TA, 2 NNS TAs
                         Thin Layer Chromatography           2               1 NS TA, 1 NNS TA
Physics                  Torques and Forces in Equilibrium   4               2 NS TAs, 1 NNS TA, 1 IES TA
Math                     Exponential Growth and Decay        3               1 NS TA, 2 NNS TAs
Electrical Engineering   Drawing a Bode Plot                 2               1 NS TA, 1 IES TA
                         *Ideal and Practical Diodes         1               1 IES TA
                         *Using the                          1               1 IES TA

Data Collection & Analysis

The data were recorded in the classroom on audio and videotape

using a Sony TCD-D8 Digital Audio Tape-corder, a Sharp VL-L490U VHS

Camcorder, a Telex FMR-150C Wireless system, and a Telex SCHF745

Headset microphone. The wireless sound system and headset microphone

allowed the TA complete freedom of movement while the researcher

remained at the back of the room with the sound and video equipment.

This method of collection produced high quality sound and video

recordings appropriate for instrumental analysis, without the problems

typically associated with natural data collection in a classroom. DAT

recordings were transferred to a Kay Elemetrics Model 4300

Computerized Speech Laboratory (CSL), and fundamental frequency (Fo)

traces were computed for all the data using the pitch extraction

function of the CSL at a rate of 10,000 samples per second.
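
For readers unfamiliar with pitch extraction, the following sketch shows one generic way of estimating Fo from a sampled signal, using the peak of the autocorrelation function. It is not the CSL's algorithm, which is not described here; the function name, the 75-400 Hz search range and the synthetic test tone are assumptions made for illustration. Only the 10,000 samples per second rate comes from the procedure described above.

```python
import math

SAMPLE_RATE = 10_000  # samples per second, as reported for the analysis


def estimate_f0(samples, sample_rate=SAMPLE_RATE, f_min=75.0, f_max=400.0):
    """Estimate Fo (Hz) of one voiced frame from its autocorrelation peak.

    The search range f_min..f_max is an assumption, not a CSL setting.
    """
    min_lag = int(sample_rate / f_max)
    max_lag = int(sample_rate / f_min)
    best_lag, best_score = None, float("-inf")
    for lag in range(min_lag, max_lag + 1):
        # Similarity between the frame and a copy shifted by `lag` samples.
        score = sum(samples[n] * samples[n + lag]
                    for n in range(len(samples) - lag))
        if score > best_score:
            best_lag, best_score = lag, score
    return sample_rate / best_lag


# Synthetic 200 Hz tone, 50 ms long, standing in for a voiced frame.
frame = [math.sin(2 * math.pi * 200 * n / SAMPLE_RATE) for n in range(500)]
print(round(estimate_f0(frame)))
```

A sketch of this kind makes no voicing decision, so voiceless segments and creaky voice would still need separate treatment.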

All data were subjected to both auditory and instrumental

analysis. Brazil's original model was based on auditory analysis, and his

published work includes only a few examples of oscilloscope traces

produced in the laboratory. Although a number of analyses of natural

data using the model have since been published (e.g., Hewings, 1990),

none of these have included any discussion or presentation of

instrumental work. As is true of any model where a fit is attempted

between theoretical categories and actual data, particularly where this

involves gradient characteristics, the researcher must make numerous

decisions regarding whether a given phonetic realization constitutes a

variation within one category or a change of category. The addition of

instrumental evidence provides a permanent visual record of the basis

for these decisions and addresses the issue of internal reliability, i.e.,

"the degree to which other researchers given a set of previously

generated constructs would match them with the data in the same way

as the original researcher" (Edge & Richards, 1998: 9). For these

reasons, Fo traces have been included in the analysis as pictorial

representations of the constructs proposed in the model and to show

how gradient Fo movements have been analyzed.1

Some examples from this data set are given below to show how

typical transcription choices were made. The diagrams are printed out

1 Precedents for using both auditory and instrumental analysis to
investigate intonation structure in discourse can be found in Watt (1997)
and Schuetze-Coburn, Shapley & Weber (1991).

directly from the CSL and show amplitude and Fo readings from portions

of these data. Pitch level and movement are indicated by the dotted

lines in the lower box on the diagram (marked PITCH). Voiceless

segments cause breaks in the Fo contour, and the articulation of both

voiced and voiceless obstruents can cause noise which results in

pockets of random dots at a higher frequency than the actual Fo

contour (see, for example, Figure 3-2). Figures 3-1, 3-2 and 3-3 contain

samples of key and tone choices and Figures 3-4 and 3-5 show examples

of hesitation markers. Momentary coding problems causing false starts,

hesitation markers, filled pauses and so on are typical characteristics of

spontaneous or partially planned speech. In agreement with Hewings

(1990), I continued to use Brazil's conventions to transcribe these

features. For example, prominent hesitation markers such as the one

shown in Figure 3-5 were transcribed as level 'o' tones. Finally,

several of the speakers in the data set occasionally exhibited creaky

voice or vocal fry. As shown in Figure 3-6, the pitch extraction

function of the CSL was unable to read this data. In these cases,

auditory analysis was used for transcription decisions.
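
The behavior of the traces described above can be illustrated with a minimal sketch of frame-by-frame Fo estimation. This is not the CSL's pitch extraction function, whose algorithm is not described here; plain autocorrelation is used as a stand-in. The function names, the 75-500 Hz search range, the 40 ms frame, and the voicing threshold are all assumptions made for illustration. Only the rate of 10,000 samples per second comes from the text. Frames with no clear periodicity return None, which corresponds to the breaks in the Fo contour at voiceless segments.

```python
import math

SAMPLE_RATE = 10_000  # samples per second, the rate reported for the Fo traces


def estimate_f0(frame, sample_rate=SAMPLE_RATE, fmin=75.0, fmax=500.0,
                voicing_threshold=0.5):
    """Return an Fo estimate in Hz for one frame, or None if it looks unvoiced.

    Assumed method: normalized autocorrelation peak picking (a stand-in,
    not the CSL algorithm).
    """
    n = len(frame)
    mean = sum(frame) / n
    x = [s - mean for s in frame]
    energy = sum(s * s for s in x)
    if energy == 0.0:
        return None  # silence: nothing to measure
    min_lag = int(sample_rate / fmax)
    max_lag = min(int(sample_rate / fmin), n - 1)
    best_lag, best_corr = None, 0.0
    for lag in range(min_lag, max_lag + 1):
        corr = sum(x[i] * x[i + lag] for i in range(n - lag)) / energy
        if corr > best_corr:
            best_lag, best_corr = lag, corr
    if best_lag is None or best_corr < voicing_threshold:
        return None  # no clear periodicity: leave a break in the contour
    return sample_rate / best_lag


def f0_contour(signal, frame_len=400, hop=100, sample_rate=SAMPLE_RATE):
    """Return (time in seconds, Fo or None) pairs for successive frames."""
    contour = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        frame = signal[start:start + frame_len]
        contour.append((start / sample_rate, estimate_f0(frame, sample_rate)))
    return contour


# Synthetic example: 0.1 s of a 200 Hz tone followed by 0.1 s of silence.
tone = [math.sin(2 * math.pi * 200 * i / SAMPLE_RATE) for i in range(1000)]
signal = tone + [0.0] * 1000
contour = f0_contour(signal)
print(contour[0])   # first frame lies wholly inside the tone
print(contour[-1])  # last frame lies wholly inside the silence
```

A real recording would also need handling for the spurious high-frequency points and the creaky voice described above; the sketch does not attempt this, which is one reason auditory analysis remains necessary.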

[CSL display: Fo trace (PITCH panel) for 'question three', 0.977 to 2.175 sec]

Figure 3-1. An Example of the Transcription of Key Choices in a Series
of Adjacent Tone Units. a) The first tone unit begins with
a low key choice on 'question', and moves up to a mid
termination on 'three'; b) The second tone unit consists of
a mid key marker 'ok'. The following unit begins in a high
key on 'find'.

[CSL display: Fo trace (PITCH panel) for 'ok find the half life of uh']

Figure 3-1, continued.

[CSL display: Fo trace (PITCH panel) for 'exponential growth and decay', 7.915 to 9.322 sec]

Figure 3-2. An Example of the Transcription of Key and Termination
Choice in a Single Tone Unit. 'Exponential' and 'growth' are
transcribed as high and mid key prominences. This is
followed by a low termination on 'decay'.

[CSL display: Fo trace (PITCH panel) for 'when the meter stick is in balance']

//p+ WHEN //
//p the MEter stick is in BAlance //
Figure 3-3. An Example of the Transcription of P Tone Choices. There
is a rise-fall P+ tone on 'when', followed by a falling P tone
on 'balance'.

[CSL display: Fo trace (PITCH panel) for the tone unit beginning 'uh this is', 27.254 to 31.888 sec]


Figure 3-4. An Example of the Transcription of a Short, Non-prominent
Hesitation Marker 'uh' at the Beginning of the Tone Unit.

[CSL display: Fo trace (PITCH panel) for 'uh the second part', 45.356 to 47.824 sec]

// UH //

Figure 3-5. An Example of the Transcription of a Prominent Hesitation
Marker. This long hesitation marker 'uh' is transcribed as
a level tone.

[CSL display: PITCH panel for 'because it'll be balanced', 32.268 to 35.146 sec]


Figure 3-6. An Example of the Effect of Creaky Voice. There is no clear
Fo contour for the phrase 'because it'll be balanced' as the
CSL is unable to effectively read the data.


Transcription Conventions

All data were transcribed according to Brazil's transcription

conventions with the addition of the conventions used by Barr (1990) to

indicate sequence chains, and my own to indicate pause structure. A

summary of these is given below:

Onset Syllable:
Tonic Syllable:
Pause boundary:
Length of pause:             [ ]
Pitch Sequence boundary:
Sequence chain boundary:

H: high, M: mid, L: low      key & termination choices

Tones: p, p+                 proclaiming tones
       r, r+                 referring tones
       o                     neutral/level tone

In order to simplify the reading of the examples used in the text,

some transcription features have been excluded if they are not

immediately relevant to the discussion.



This chapter presents the analysis of the native speaker (NS)

data. The analysis shows evidence of a systematic and independent use

of prosody by the speakers in this sample, which supports both the

structure and interpretation of intonation in discourse proposed by

Brazil. Choices within the systems of key, termination and tone are

consistent with the hypothesis that the teaching assistants intentionally

use intonational cues both to mark structural boundaries in the

discourse and to negotiate a common ground with their students. The

analysis also suggests that intonation structure consistently interacts

with other levels of discourse organization, and that prosodic cues

operate in conjunction with other structural cues to assist the listener

in the interpretation of the discourse message. Based on these results,

it is argued that intonation choices should be viewed as interactive in

nature, i.e., organized for the benefit of the hearer and as contributing

independently to the overall comprehensibility of the discourse.

The results of the analysis are divided into three areas of

intonation structuring: sequence chains, pitch sequences and discourse

markers, and lastly, tone choice and orientation. The chapter begins

with a more detailed description of the data included in the native

speaker group.

Native Speaker Data Set

A summary of the NS data set is given below, followed by a brief

description of the content of each of the teaching extracts:

Table 4-1. Summary of NS data


SN 4 MINS 113

LE 4 MINS 100

MK. The opening of a chemistry prelab presentation. The students

are about to begin an unknown analysis for which they first have to

complete and hand in a scheme, i.e., a plan of how they will conduct the

analysis. The TA is reviewing the procedures that should appear in the

unknown analysis scheme.

SN. The opening of a chemistry prelab presentation. The students

are beginning a Thin Layer Chromatography experiment. The TA is

demonstrating the procedures and equipment the students will use.

KN. The opening of a physics prelab presentation. The students

are conducting an experiment investigating torques and forces in

equilibrium using a meter stick and some weights. The TA is explaining

the procedures, the physics equations the students will be testing, and

pointing out a potential confusion the students may encounter near the

end of the lab.

LE. The opening of a physics prelab presentation. The students

are conducting the same torques and forces experiment as in (KN)

above; however, in this extract, the TA is reviewing a question the

students had difficulty with in the prelab homework.

BD. This extract comes from the middle of a 45-minute prelab

lecture the course supervisor asked the TAs to give in electrical

engineering. Students are about to conduct an experiment testing a

mathematical equation that relates input to output voltage. One subtopic

was chosen, in which the TA explained how to plug the experimental

results into the equation, and graph these findings using a Bode Plot.

BL. For the pre-calculus math discussion sections, students

complete a set of problems for homework prior to the class. Students

then choose a number of these problems they would like the TA to

review on the board. In this extract, taken from the middle of the

class, the TA reviews a question from a section on exponential growth

and decay. Each problem is presented as a complete discourse event

bounded by long pauses as the TA erases calculations from the board,

checks the next question in the textbook and so on. Therefore,

presentation of one problem only was chosen for this analysis.

Sequence Chain Structure
The reader will recall that the sequence chain (SC) structure

proposed by Barr (1990) suggested that larger prosodic units bounded

by a high key or lecturing frame and a low termination could be found

in teaching discourse. Her analysis also proposed that SCs coincided

with shifts in discourse plane and the layout of prepared visuals such

as overheads or handouts. As noted in the previous chapter, due to

the lack of prepared materials in these prelab presentations, this

analysis will focus on co-occurrence with plane changes and transaction


Sequence chain structure was readily identifiable in the data set

analyzed here. There were 36 SCs in total1 (between 5 and 9 SCs were

found in each extract). 15 of the SC openings began with a mid or high

key lecturing frame such as //SO//, //oK// or //NOW// and the

remaining with a high key. SCs closed with a low termination on a

content word or a structural discourse marker such as //oK//, and in

one case (discussed below in Figure 4-2) a low key filled pause. The

length of SCs varied between 12 and 25 tone units across speakers and

typically consisted of a focussing boundary or frame in one tone unit

followed by a number of tone units containing a topic expression and

development and a final tone unit or small group of units forming a

closing boundary. SC boundaries did coincide with changes in discourse

plane, and the majority were clearly coextensive with transaction

boundaries denoted by other non-prosodic criteria.
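
The boundary criteria described above can be sketched as a simple grouping rule. This is an illustrative reading only, not the procedure used in the analysis, which also drew on auditory and instrumental judgment. The `ToneUnit` fields and the function name are hypothetical, and the sample labels are invented rather than taken from the transcripts. The rule assumed here is that a new SC opens when the preceding tone unit closed with a low termination and the current unit either has a high key or is a lecturing frame in a mid or high key.

```python
from dataclasses import dataclass


@dataclass
class ToneUnit:
    text: str
    key: str            # "H", "M" or "L"
    termination: str    # "H", "M" or "L"
    is_frame: bool = False  # lecturing frame such as SO, oK, NOW


def split_sequence_chains(units):
    """Group tone units into sequence chains (assumed rule, see lead-in)."""
    chains, current = [], []
    previous_closed_low = True  # treat the start of the extract as a boundary
    for unit in units:
        opens_chain = unit.key == "H" or (unit.is_frame and unit.key in ("M", "H"))
        if current and previous_closed_low and opens_chain:
            chains.append(current)
            current = []
        current.append(unit)
        previous_closed_low = unit.termination == "L"
    if current:
        chains.append(current)
    return chains


# Invented labels for illustration; not a transcript from the data set.
sample = [
    ToneUnit("SO", "H", "M", is_frame=True),
    ToneUnit("topic statement", "M", "M"),
    ToneUnit("development", "M", "L"),
    ToneUnit("further development", "M", "L"),
    ToneUnit("oK", "M", "M", is_frame=True),
    ToneUnit("next topic", "M", "L"),
    ToneUnit("high key opening", "H", "M"),
]
for chain in split_sequence_chains(sample):
    print([unit.text for unit in chain])
```

Note that a mid key unit following a low termination does not open a new chain under this rule; it only opens a new pitch sequence within the same chain.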

Typical examples of SC structures coextensive with shifts in

discourse plane are illustrated in Figures 4-1, 4-2, and 4-3. Figure 4-1

shows the final tone units of the first SC in MK's presentation and the

opening tone units of the second. The SC boundary separates the first

part of the presentation concerned with the organization of the class

1 The final SC in MK and SN were not analyzed in their entirety as
they were very long. In both cases, I stopped the transcription at a low
key pitch sequence boundary.

from the presentation of the main content 'for our unknown, we have

seven ions we have to test for'.

M //r p section FOUR everybody's there GOOD// [0.07]
L RIGHT alright

M //p that's about as far as it's necessary// [0.5] //r cos I'm

M BAsically only gonna go over our POsitive IONS// [0.73] //p and

H FOR our
M BRIEfly over the ////o p but
L NEgative ions// [1.0]

M uniKNO we HAVE em//[1.7] //r SEven ions we have to TEST for//

Figure 4-1. Co-extensive Sequence Chain and Plane Change Boundary in
MK's Presentation.

Figure 4-2 shows a series of two adjacent sequence chains from

LE's transcript. The first SC begins with a high key focussing

expression 'so you guys had problems with the prelab right', followed

by LE reading aloud the problem question. This SC closure is the only

example in this data set of a low key filled pause being analyzed as a

SC boundary (note the mid key choice on 'zero'). Support for this

analysis is found in a number of co-occurring cues such as the shift in

discourse plane as LE moves to the blackboard to explain the problem

('the way this thing goes is'), the topic length pauses either side of

the filler and the behavior of the TA who clearly scans the audience

before moving toward the blackboard. Finally, Figure 4-3 illustrates a

shift in discourse plane and co-occurring SC boundary as KN moves

from talking about the content to initiating a direct exchange with the


H ////p p r+ so you GUYS had PROBlems with the PRElab RIGHT//

M //o o AND the FIRST question WAS uh// [4.42] //p QUEStion ONE

M was// [0.77] //p for the exAMple on pages four and FIVE// [0.4]

M //p FIND out TORques// [0.92] //p r+ for an AXis at x equals ZEro

M and show that their SUM is still ZEro// [1.95]
L //? UH// [4.85]

H ////the WAY this// [0.7] //p r+ thing
M GOES IS we...

Figure 4-2. Coextensive Sequence Chain and Plane Change Boundary in
LE's Presentation.

Shifts in discourse plane occur frequently in the classroom as the

teacher moves from 'telling something' to 'talking about telling

something' or to 'asking something', and the sequence chain structuring

illustrated above increases boundary strength at these points in the


Turning now to the co-occurrence of sequence chains with

transaction structures, there was a marked correspondence between SC

M //p r+ and this is the CENter of
L MASS aGAIN// [0.5] //o UM// [2.0]

H ////p p p if you were to HANG SOMEthing TEN CENtimeters
M aWAY//

M //o p how much MASS would you have to HANG so that THAT would

M be in roTAtional equiLIbrium// [2.85] //r+ does anybody KNOW//

Figure 4-3. Coextensive Sequence Chain and Plane Change Boundary in
KN's Presentation.

boundaries and a constellation of cues marking a transaction boundary.

24 of the 34 complete sequence chains were also marked as transaction

boundaries and coincided with both prospective and retrospective

marking by non-prosodic means. A further four SCs coincided with the

kind of shift in discourse plane illustrated in Figure 4-3. The six

remaining SCs co-occurred with either prospective or retrospective

marking. Transaction boundary cues were identified using Shaw's (1994)

criteria, and a brief description of the examples found in this data set

is given below.

The most typical of the 28 prospective markers found in the data

were high or mid key lexical phrases (10)

FIRST thing you wanna do is//
// the

the WAY this thing
// GOES IS//

//in case//

or high or mid key lecturing frames (15) such as: //SO//, //oK//, or

//NOW//. In three cases, the boundary was marked with a non-

prominent //ok//, followed by a high key topic statement.

The 29 retrospective markers found in the data were divided

fairly evenly between recapitulation statements (7) in a mid or low key

with a low termination:

//that's just a sorta explanation of the
//so that's the BAsic gist of the
or lexical micro markers (11) such as //oK// and //SO// in a low key.

Speakers also used topic length unfilled pauses accompanied by a

preceding low key choice (10) as they scanned the audience and the


Figure 4-4 illustrates coextensive transaction and sequence chain

boundaries. This shows a series of two adjacent sequence chains and

the beginning of a third from SN's presentation, in which the structural

markers separate a series of instructions given to the students

concerning the equipment they will be using for the experiment. The

first SC begins with two prospective focussing markers in a high key

'ok for TLC you're gonna need several pieces of equipment' and 'first

off', and the final transaction and sequence chain boundary co-occur

with the recapitulation 'you're gonna make your own little developing

chambers' ending in a low termination and accompanied by a topic

length pause. The second SC, as SN moves from discussing the

developing chamber to the chemical solvent, is marked with a non-

2 The final case is the filled pause in LE's transcript discussed above.

prominent focussing marker and a high key 'ok the solvent you're

gonna use'. Again, this ends in a recapitulation 'so it's there if you

forget what solvent system to use', ending in a low termination.

As with the shifts in discourse plane, the constellation of cues

provided by the co-occurring transaction and SC boundaries indicates

places of maximal disjunction in the discourse, i.e., points where the

language organization binding one group of pitch sequences or tone

units is completed.
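
The counts reported in this section can be cross-checked with a short tally. The figures are those given in the text; the dictionary labels are paraphrases, and the assumption that exactly two SCs were incomplete follows from the footnote on the final SCs in MK and SN together with the totals of 36 and 34.

```python
# Counts as reported in the text for the NS data set.
TOTAL_SCS = 36
INCOMPLETE_SCS = 2  # assumed: the final SC in MK and in SN
complete_scs = TOTAL_SCS - INCOMPLETE_SCS

sc_boundaries = {
    "prospective and retrospective marking": 24,
    "shift in discourse plane": 4,
    "prospective or retrospective marking only": 6,
}
prospective_markers = {
    "high or mid key lexical phrases": 10,
    "high or mid key lecturing frames": 15,
    "non-prominent ok plus high key topic statement": 3,
}
retrospective_markers = {
    "recapitulation statements": 7,
    "lexical micro markers": 11,
    "topic length unfilled pauses": 10,
    "filled pause in LE's transcript": 1,
}


def adds_up(parts, expected_total):
    """Return True if the category counts sum to the reported total."""
    return sum(parts.values()) == expected_total


print(adds_up(sc_boundaries, complete_scs))
print(adds_up(prospective_markers, 28))
print(adds_up(retrospective_markers, 29))
```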

Regarding Barr's original criteria based on prepared visual

materials, there was one teaching presentation (MK) in which 'real-time'

boardwork always coincided with a new SC boundary and was used to

emphasize structural boundaries. This is shown in Table 4-2. In the five

other presentations in this data set, while SC junctures frequently

marked a change in teacher activity such as writing on the board (see,

for example, Figure 4-2), boardwork illustration was used to exemplify

items described in the discourse such as particular equations or

diagrams of equipment rather than to additionally mark structural

boundaries in the discourse. As noted earlier, there were six SC

structures in the data that did not coincide with both prospective and

retrospective transaction boundary markers and exemplified two further

features of typical classroom discourse. First, the kind of teaching

presentations found in this data are examples of partially planned

spoken discourse that is subject to the effects of 'online production'

such as hesitation phenomena and repairs. Figure 4-5 shows BD

initiating a repair structure from the low key 'take the magnitude' to

the high key 'take the twenty log of the magnitude' which then begins

H TLC you're gonna need SEveral// [0.68] //p pieces of
M ////p ok for

H FIRST OFF// [0.17]
M eQUIPment// [1.2] //p ok //o o you're gonna

M NEED one of your two hundred and FIFty milliliter BEAkers//

M //p and one of your WATCH GLAsses// [0.5] //ok this is gonna be

M now your deVEloping CHAMber for the// [0.45]
L //o UM// [0.74]

M //p TLC// [0.57] //p you're gonna make your OWN little developing

H SOLvent you're gonna
M ////p ok the USE to
L CHAMbers//// [1.14]

H PLATES// [3.6] //p is ethoLAcetate// [2.4] //r and
M deVElop the

H this IS in the NOTES// [0.4]
M //p SO// [0.82] //p it's THERE if

M you forget what SOLvent system to ////p oK//
L USE//// [0.82]

Figure 4-4. Coextensive Sequence Chain and Transaction Boundaries from
SN's Presentation.

a new sequence chain. In this case, it is not clear whether the SC

opening is an intentional structural boundary or a result of a rise in

key typical of repairs initiated by a speaker to ensure correct

Table 4-2. Visual Cues for Sequence Chain Structure in MK's


//but FOR our unKNOWN we HAVE em//        Na+, K+, NH4+, OH-, NO3-, Cl-, HSO4-
SEven ions we have to TEST for//
//one of the FIRST things that we         1. Flame test
did was a FLAME
//the SEcond set of TESTS we did was      2. Cobaltinitrate test
that cobaltiNItrate TEST//

interpretation of the message (Cutler, 1983). Earlier in his presentation,

BD has already made it clear that it is important that the students

remember to take the twenty log of the magnitude,3 suggesting that this

may be the reason for this particular high key choice. As Sinclair &

Brazil (1982:31) note, spoken discourse is made in real-time, and many

different considerations can lead to occasional ambiguous or

indeterminate utterances.

The five remaining SCs coincided with activities outside the text

itself. In a typical classroom setting, there are a variety of activities

that accompany the presentation of the informative content. These are

described by Coulthard and Montgomery (1981) as forming a

paradiscourse subtext. Paradiscourse includes activities directly related

to the content such as boardwork and demonstrating equipment and

3 The second sequence chain in BD's presentation focuses on this
point: 'The reason they have the twenty log times the magnitude of the
function is because whenever you take the log of something, instead of
multiplying you can just add'.

H ////p p p TWENty
M LOG of the
L //so if you TAKE the mag-// [0.2]

M MAGnitude of BOTH sides of THIS// [1.0] //eQUAtion then you just

M GET// [1.24] //p p TWENty LOG of K ZEro//

Figure 4-5. A Sequence Chain Boundary Following a Repair Structure in
BD's Presentation.

more incidental actions such as opening windows or asides commenting

on the lack of chalk or an eraser. Coulthard & Montgomery suggest

that these actions can also shape features of the discourse text,

particularly prosodic organization.

In this data, the five SCs which did not co-occur with transaction

boundaries were directly related to this paradiscourse subtext. A

typical example of the interaction between a procedural aside related to

the boardwork and SC structure is shown in Figure 4-6 from MK's

presentation. The first SC boundary coincides with the prospective

marker and topic statement 'alright, the second set of tests we did was

that cobaltinitrate test'. This is followed by a combined pause of more

than seven seconds while he writes this on the board and adds the

chemical notations. It closes with a low terminating procedural aside 'I

think that's right' directly related to the equation he has written on the

board. The high key on 'remember' signals the end of this aside and

technically begins a new SC although this is clearly the same topic.

H SEC- SEcond set of
M [2.43] ////p o r alRIGHT er the TESTS we did was

H TEST// [4.9]
M that cobaltiNItrate //r+ i THINK
L //um// [3.6] that's

H reMEMber when you
M ////p p r+ if you
L RIGHT// [0.55] //er// [0.66]

M DID that it WAS a it formed a yellow preCIpitate for both er

H aMMOnium// [0.92]
M poTAssium and

Figure 4-6. An Example of the Typical Interaction between a Procedural
Aside and Sequence Chain Structure from MK's

Figure 4-7, taken from SN's presentation, shows a high key on

'pour it in there' after a low terminating aside directly addressed to the

students regarding the chemical he is using in the demonstration, 'this

is not etholacetate, don't use it'. Again, although this technically begins

a new SC, the topic is clearly a continuation of his discussion of the

etholacetate solvent. Interactions with the paradiscourse subtext occur

throughout the presentations in this data set and affect all levels of

prosodic structuring investigated here (sequence chains, pitch

sequences and tone units).

These examples highlight the interactive nature of discourse

organization and the need to take into account different levels of

structuring on a moment by moment basis in order to give a principled

account of any one level. When viewed in conjunction with the

paradiscourse subtext, key changes which resulted in SC structuring

apparently unmotivated by co-occurrence with transaction boundaries or

obvious shifts in discourse plane could be reasonably explained and,

presumably, reasonably interpreted by the hearer(s).

Summary of Sequence Chain Structure

The analysis of sequence chain structure found in the NS data

suggests that this unit of prosodic organization is used consistently by

the TAs in this sample to organize the discourse for the benefit of their

students. Points of maximal disjunction in the prosody bounded by high

key or lecturing frames and low termination were matched by a number

of other focussing boundaries which together operated to divide the

discourse into a series of "chunks" usually coinciding with topic


H TAKE the ethoLAcetate this is
M //p o r+ then NOT ethoLAcetate

H ////p POUR it In
M THERE// [1.0] //so it
L DON'T USE it// [3.47]

H JUST// [0.05] //o COvers the
M BOTtom and you've got MAYbe// [0.5]

M //p r+ OH a CENtimeter or SO//[0.5] //p or a little bit LESS than a

M centimeter of SOLvent on the
L BOTtom//

Figure 4-7. An Example of the Typical Interaction between a Procedural
Aside and Sequence Chain Structure from SN's

Prosodic structuring of sequence chains was also shown to reflect the

online nature of spoken discourse production and a close relationship

with the paradiscourse subtext which forms an integral part of

classroom discourse. For the purposes of this analysis, intonational

features have been discussed largely independently of lexical content;

however, it is clear in the majority of cases that lexical content and

choice of key support each other, particularly in the kinds of high key

lexical phrases that are often used to open transactions and the low key

markers that signal their completion.

One final point should be made about the interaction of SC

structuring and topic structure in particular. As noted by Levelt

(1989: 385) in a discussion of intonational phrases which I think applies

equally well here, "[A] break decision is under the speaker's executive

control". The speaker's intent can outweigh any other considerations

and will create exceptions to any patterns that can be established in the

data. A typical example is shown in Figure 4-8. This is the only

example in this data sample where SC structures extended no further

than a focussing boundary and a topic expression. The opening of BL's

H ////p oK// [0.7] //p EXponential
M GROWTH and deCAY//// [3.4]

M ////p oK//[0.32] //p this is EXponential
L GROWTH// [1.57]

Figure 4-8. Two Short Sequence Chains Coextensive with Topic
Pronouncements from BL's Presentation.

presentation is divided into two topic announcements. First the overall

topic of the section 'exponential growth and decay', and then the

subgroup this problem is part of 'this is exponential growth'. Each SC

acts, in effect, as a 'pronouncement' of the topic using proclaiming

tones and lecturing frames. The choice to present the information in

this way creates an unusual series of short sequence chains compared

to the rest of the data in this sample. However, it is likely that were

more data analyzed, equivalent exceptional cases would be found.

Examples such as this reflect the independent nature of intonation

structure which need not be defined by other levels of discourse

organization. Speakers may choose to exploit any part of the system

within the given parameters.

Pitch Sequences and Discourse Markers

This section is divided into two parts. The first part will focus

on the pitch sequence (PS) structure found in the data and the

relationship between PSs and their internal tone unit structure. The

second part examines discourse markers, and particularly, the speaker's

choice of key on these markers and how this interacts with pitch

sequence closure.

Pitch Sequences

Pitch sequences consist of a group of tone units bounded by low

termination choices. The number of pitch sequences per sequence chain

varied quite widely both within and between speakers (between 1 and 12

pitch sequences per sequence chain); internally, PSs ranged in length

from one tone unit containing a low key marker such as //oK// to

longer pitch sequences containing a number of tone units. This

variation meant that quantitative comparisons were not as productive as

a qualitative analysis of the kinds of information typically associated

with pitch sequence structure and how this was reflected in choice of

key. To exemplify this, I have used Coulthard and Montgomery's (1981)

classification of classroom content.
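
The definition given at the start of this section, a group of tone units bounded by low termination choices, can be sketched as a simple segmentation. The `ToneUnit` fields and the function name are hypothetical, and the sample labels are invented for illustration rather than taken from the transcripts; the actual transcription decisions rested on auditory and instrumental analysis, not on a rule of this kind.

```python
from dataclasses import dataclass


@dataclass
class ToneUnit:
    text: str
    key: str          # "H", "M" or "L"
    termination: str  # "H", "M" or "L"


def split_pitch_sequences(tone_units):
    """Close a pitch sequence after every tone unit with low termination."""
    sequences, current = [], []
    for unit in tone_units:
        current.append(unit)
        if unit.termination == "L":
            sequences.append(current)
            current = []
    if current:
        sequences.append(current)  # trailing units not yet closed by a low termination
    return sequences


# Invented labels for illustration; not a transcript from the data set.
sample = [
    ToneUnit("focussing marker", "H", "M"),
    ToneUnit("main content", "M", "M"),
    ToneUnit("aside", "M", "L"),
    ToneUnit("oK", "L", "L"),
    ToneUnit("main content resumed", "M", "M"),
]
lengths = [len(ps) for ps in split_pitch_sequences(sample)]
print(lengths)
```

The single low key marker forms a pitch sequence of one tone unit, which matches the lower end of the range of lengths described above.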

Coulthard & Montgomery (1981) divide classroom discourse into two

overall types of content: main and subsidiary. Main discourse consists

of the informative content of the presentation, and it may be

interspersed with various kinds of subsidiary content such as the

comments relating to the paradiscourse subtext that were discussed in

the previous section. The category of subsidiary content subsumes a

variety of teaching purposes from short glosses or asides that enlarge

upon, exemplify or recapitulate informative content, to much longer

chunks of discourse concerned with the organization of the class.

A typical example of this kind of subsidiary content is illustrated

in Figure 4-9. This figure shows the first sequence chain in MK's

presentation, which is made up of subsidiary content, followed by the

beginning of the second sequence chain, which marks the boundary

between this and the beginning of the main discourse, or informative

content 'for our unknown we have seven ions we have to test for'. The

first PS can be glossed as "getting the students' attention" by first

framing and focussing a topic expression 'Ok, begin about today', and

then adding a mid key 'invitation' to the students to gather round the

board. The second mid key pitch sequence adds further subsidiary

content described by Coulthard and Montgomery as a gloss, i.e., a

comment on previous information often containing an attributive term; in

this case 'it's a great time to see if you like it'. The following mid key

unit consists of a procedural aside to check the students are ready to

begin the unknown. A high adjudicating key is used for the yes/no

question 'everybody has made it up to at least section 4 on lab 2?' and

this is followed by a mid key repetition. This pitch sequence ends with

the low key aside 'except for you'. MK's use of equative low key

reflects both the parenthetical nature of the comment which is

addressed to one student in particular rather than the whole group, and

that he is evidently aware that this student is behind the others based

on his nonverbal reactions. As is typical following a low key aside, MK

raises the key of his next tone unit which opens a new pitch sequence

directed back to the group as a whole. The final pitch sequence ends

the direct exchange with the students using a typical concurring

response 'good' given in a mid key and a falling tone, and adds a final

comment regarding what MK will cover in his presentation. Immediately

after this pitch sequence closure, MK opens the new sequence chain

marking a shift in discourse plane from class organization to class

content. Within the first sequence chain, the PS structure marks the

boundaries between 'talking about telling' and 'asking' and further

distinguishes between comments directed to one student and to the

group of students as a whole.

This extract is the most complex example of relationships between

different kinds of subsidiary content and PS structure found in the

data. More frequently, pitch sequence boundaries separated one piece

of subsidiary content from the main content surrounding it as shown in

Figure 4-10. In this sequence chain from SN's presentation, there are

two pitch sequences. The first begins with the high key focussing

marker 'now the first thing you wanna do' as he begins telling them

how to mark the TLC plate. The main content continues until the tone

units containing 'and there's rulers in the stockroom'. This second unit

ends with a low terminating aside 'and I don't have a pencil with me' as

SN realizes he does not have a pencil to demonstrate exactly what the

students should do. This closes the pitch sequence and SN then rises

to a mid key to complete the informative content. The mid key choice

reflects the continuation of the topic, i.e., 'the first thing to do with

your plate is to mark it' which was cut short by the aside.

A connection between pitch sequence closures and low key

equative asides mirroring boardwork was also a consistent pattern and

on several occasions, two low terminations marked the boundaries of a

"paradiscourse" unit, i.e., a unit of structure consisting solely of

boardwork. Figure 4-11 shows an example from BD's presentation where

there were several extremely long pauses (close to 60 seconds) while he

wrote on the blackboard. The low termination following the boardwork

functions as a final boundary cue and is followed by a new sequence

chain marked by a lecturing frame.

A final pattern that emerged from this data was groups of low key

units directly following each other forming a series of separate pitch

sequences. This pattern is not discussed in Brazil's work or that of his

colleagues. Brazil (1997) suggests that "pitch sequences having initial

low key tend to be short, often amounting to no more than one tone

unit in length" (p. 124). While this is mostly true of the low key PSs in

this data, particularly in their most common use to mark procedural

H oK em// p r beGIN about
M ////p o toDAY i'm just gonna go over our

M unknown aNAlysis SCHEME//p cos it'll BEnefit anybody who's going

M to need to be WORKing on it//pp so if you wanna gather ROUND

M gonna do it up here on the HAVE ONE it's a
L BOARD// ///p p if you

M great time to check to see if you ///p o o r uh before i
L LIKE it///

H EVerybody has TWO//
M START MADE it up to at LEAST section FOUR on lab

H TWO everybody's made it at LEAST that
M //r p r+ section FOUR on lab

M FAR RIGHT// p exCEPT for ///r p section FOUR everybody's there
L YOU///

M ///p GOOD// p that's about as FAR as it's NEcessary//

M //r cos I'm BAsically only gonna go over our POsitive IONS//

Figure 4-9. An Example of the Interaction between Main and Subsidiary
Content and Pitch Sequence Structure from MK's

H FIRST thing you wanna //WITH your tlc
M ////p now the DO// PLATE

H HOPEfully you've got a
M PENcil// r+ o alRIGHT and they'll there's

H RUlers//
M //p r in the STOCKroom and i don't have a PENcil with
L ME///

H MARK// p at least a ONE
M ///p but what you wanna DO is you wanna

H centimeter
M LINE// p down the BOTtom of this PLATE//

Figure 4-10. An Example of the Interaction between Pitch Sequences
and Low key Subsidiary Content from SN's Presentation.

M ////p SO//
L //p EM let's SEE// [45 seconds boardwork] //p ok//

Figure 4-11. A Paradiscourse Unit Consisting only of Boardwork from
BD's Presentation.

asides, several times in the data an extended pattern of low key initial

units was found which was unrelated to the paradiscourse text. Examples

are given in Figures 4-12 and 4-13. Figure 4-12 shows the second

sequence chain in BD's presentation. The first sequence chain contains

the topic announcement and a definition of a Bode Plot:

The Bode plot's just a plot of the twenty log of the
magnitude of the frequency response against omega.

The next sequence chain, shown here, marks a shift in discourse plane

(i.e., a shift in the area of attention of the discourse) as BD talks about

what the presentation will cover. It begins with a lecturing frame and

focussing marker followed by a mid key enlargement 'we're just gonna

M ////p SO// p so toDAY// p we're just gonna LEARN how to PLOT//

M //p SIMple//p SIMple//p er BOde plots of the FREquency

L //p p p p and THEN/// you can LEARN/// about the Other KIND///

L ///in your CLASS/// p that are a little bit MORE COMplicated///

M ///p p for this LAB THIS'll
L ///p NOT MUCH/// but// DO///

Figure 4-12. A Series of Low Key Pitch Sequences from BD's Presentation.

learn how to plot simple Bode Plots'. The prominence choice on 'simple'

projects an existential paradigm in which 'simple Bode Plots' contrast

with other types of Bode plot. The series of low key units that follow

confirm this paradigm. The sequence of mid and low key units can be

loosely glossed as: 'I've told you we are going to look at simple Bode

Plots today, and I assume you understand that this means there are

other kinds of plots, but I will repeat this already understood

assumption now'. The low key units consist of this reformulation and a

low key gloss 'that are a little bit more complicated, not much'. When

the focus returns to the main content, i.e., what the students will need

for this lab, BD rises to a mid key 'for this lab, this'll do'.

Figure 4-13 shows a similar kind of reformulation in a series of

low key units from BL's presentation. The sequence chain begins by

describing one of the three variables the students will use to solve the

equation (hence the use of the high key to distinguish R from the other

two) and the definition of this variable constitutes one pitch sequence.

The following PS begins in a mid key, enlarging on the first 'it's

getting bigger' and providing a specific example 'you're getting more

money'. This is followed by a series of low key units that personalize

the previous example 'you want that, you want more money'. As in BD's

extract, when BL returns to the main content, i.e., the distinction

between a positive and negative R, he moves to a mid key.

In summary, this analysis showed that speakers made key and

termination choices that created pitch sequence structures which

were related to each other within sequence chains, and internally

between tone units. PS structure emphasized the boundaries between

main and subsidiary content and demonstrated a similar kind of

relationship with other levels of organization as that seen in sequence

chain structuring.

Discourse Markers

The analysis of pitch sequence structure also highlighted the

speakers' use of discourse markers, particularly what have been called

frames or micromarkers such as SO, NOW, OK, rather than the longer

lexical phrases also used to mark transaction boundaries. In analyses of

H ////p and R's what's
M CALLED// p it's a GROWTH
L CONstant///

H BIGger//
M ///r if r's POsitive the THING's getting //r you're

M getting MORE MOney// //r+ RIGHT//
L //you WANT// r r+ you WANT

L THAT///you want your money to GROW in a BANK/// ///pppp

M NEgative the STUFF is
L getting more bacTEria/// whatever if R is

M is getting SMALler it's deCAYing//

Figure 4-13. A Series of Low Key Pitch Sequences from BL's Presentation.

teaching discourse and particularly those related to this model, only

high and mid key markers appearing with a proclaiming tone and termed

framing devices are discussed (Sinclair & Brazil, 1982; Barr, 1990).

Other research investigating these markers in transactional discourse

suggests that framing devices are part of a larger set that combines

both pragmatic and semantic functions and can operate as both

structural boundary markers and logical connectors (Flowerdew &

Tauroza, 1995; Nattinger & DeCarrico, 1992). Nattinger and DeCarrico

further suggest that lexically equivalent markers operate differently in

discourse structure depending on tone choice. For example, OK realized

with a falling intonation and followed by a pause indicates a topic shift,

whereas the same OK marker with a level intonation and no following

pause marks a summary of the preceding information. In addition, they

include a rising tone on the same lexical markers which indicates

clarification is being sought from the hearer by the speaker.

Analysis of the discourse markers in this data confirmed that high

and mid key frames formed part of a larger set of lexically equivalent,

but prosodically distinct markers along the lines suggested by Nattinger

& DeCarrico.[4] Five lexical markers, Alright, Right, OK, So, and Now

appeared throughout the presentations and produced a total of 63

markers in this data set. The prosodic composition of these markers is

summarized in Table 4-3.

Both the high and mid key proclaiming, or in a few instances level

markers, functioned as typical frames, usually in sequence chain or pitch

sequence initial position, in the manner suggested by Brazil and other

Table 4-3. The Prosodic Composition of Discourse Markers in the NS Data

MID KEY 20 10 3
LOW KEY 7 16 3

teaching discourse researchers. However, approximately a third of the

discourse markers appeared in a low key or with rising tones. Looking

[4] Only markers that had an obvious structural function (whether that
was combined with a semantic function or not) are discussed in this

first at the low key markers, analysis of pitch sequence structure

suggested that in approximately half of these examples (12 cases)

markers operated not only as structural markers indicating transaction

boundaries, but also as dummy tone choices (Brazil, 1997) to end pitch

sequences which showed a mid key termination on the prior tone unit.

While Brazil suggests that pitch sequence closure may be achieved

through dummy tone choices, this is not included as a possible function

of discourse markers. A typical example is given in Figure 4-14.

In this sequence chain, SN is working with the equipment as he is

speaking, and this additional call on his attention appears to cause a

momentary problem with linguistic coding shown by the hesitation and

flattened intonation. The informative content 'one plate' finishes on a

mid key termination and SN indicates a structural boundary by the

addition of a low key falling marker. It is clearly the speaker's intent

H these PLATES are BIG enough that you can
M RUN your entire exPEriment//

M [1.24] //o QO// [0.86] //p ONE PLATE// [0.4]
L //p oK// [0.9]

H NOW that you've NEXT thing you wanna
M ////p p so DRAWN this the DO//

Figure 4-14. An Example of a Low Key Dummy Tone Choice from SN's Presentation.

to mark a boundary here as the following tone unit indicates a new

sequence chain with the high key 'now that you've drawn this the next

thing', and it is possible that the additional focus on the equipment

interfered with the intonation structure SN intended to project in the

final unit of the sequence chain. Reconstruction of speaker intent at

this level is clearly difficult to show. Even the speaker himself would, in

all likelihood, not be able to recall this kind of online decision.

However, the number of similar cases in this data suggests that low key

markers may be used to fulfill this function.

Turning now to discourse markers exhibiting a rising tone,

Nattinger & DeCarrico suggest these indicate the speaker is seeking

clarification from the hearer. Certainly this notion can be subsumed

under Brazil's definition of referring tones by suggesting that rising

markers are a cue to the listener that the speaker is asking for (or will

be seen as asking for) confirmation of her belief that speaker and

hearer have negotiated a common ground, i.e., that the speaker is right

in assuming the hearer can interpret the discourse message. An

example from KN's presentation is shown in Figure 4-15. KN completes

the answer to the problem he has worked through on the board, and

then follows this with a low rising OK marker followed by a long pause

in which students could confirm their understanding or ask any

questions they have.

I noted above that rising markers could be seen as asking for

confirmation, rather than genuinely asking for confirmation as in Figure

4-15. Figure 4-16 shows another rising marker 'right' from BL's

transcript. In this case the marker is followed by a barely audible

pause (0.08 seconds) and there is clearly no "wait time" for a student

response. In this case, I suggest these markers are acting rather as

M you're BAsically gonna SHOW that// [0.27] //p p p the SUM of the

M TORQUES is equal to ZEro and that's when it
L BAlances// [0.27]

L //r oK// [0.7]

Figure 4-15. An Example of a Rising Confirmation Marker from KN's Presentation.

solidarity markers. They indicate to the hearer that the speaker is

aware of her audience and imply that the speaker is directly confirming

common ground. It is suggested that this is a technique used to create

solidarity with the hearers by acknowledging their participation in the

discourse, and it may be that frequent use of rising markers will

encourage more participation indirectly by giving this impression.

H BIGger// [0.35]
M //r if r's POsitive the THING's getting //r you're

M getting MORE MOney// [0.3] //r+ RIGHT// [0.08]

Figure 4-16. An Example of a Solidarity Marker from BL's Presentation.

It is probable that these markers are not discussed in Brazil's

work or the teaching studies that stemmed from it (Brazil, Coulthard &

Johns, 1980; Sinclair & Brazil, 1982) because of the different nature of

the classroom discourse used in these analyses. Brazil and his

colleagues worked with primary or middle school data, and much of this

involves "telling" in proclaiming tones. When "asking" is included, it is

usually in the context of tightly structured IRF exchanges such as the

following in which the teacher is asking for the correct response only:

T: What's the annual rainfall here?
P: About thirty inches
T: Yes, good. (Sinclair & Brazil, 1982: p. 57)

In addition, the status of the teacher as controller of the discourse in

these classrooms is invariably absolute, and confirmation of student

understanding is often achieved through the kinds of display questions

exemplified above rather than by direct appeal to the students. In

contrast, in university classrooms, especially those taught by TAs, the

relationship between the teacher and students can be more open to

negotiation and more fluid (Shaw & Bailey, 1990; Tyler, 1995). One

manifestation of this recognition of the TA as more of a "facilitator" is

the use of 'asking' rather than 'telling' and evidence of negotiation cues

such as the use of rising markers.

Summary of Pitch Sequences and Discourse Markers

In summary, pitch sequence structuring was consistently used by

all the speakers in the sample to mark relationships within sequence

chains and between tone units. Pitch sequence boundaries frequently

marked changes between main and subsidiary content by alternations in

key choice and interacted with other levels of organization such as the

paradiscourse subtext. Discourse markers marked both sequence chain

and pitch sequence boundaries, and the prosodic features of these

markers suggest that key choice is an important part of understanding

how these markers work. There is some evidence to show that low key

markers are multifunctional, acting both as structural boundary markers

and as dummy tone choices to complete pitch sequence closure. Finally,

low key markers and rising markers are added to the original set of

frames proposed by Brazil, and it is suggested that these reflect the

particular type of classroom discourse constituting this data set.

Tone Choice and Orientation

This section investigates tone choices made in the data. The

reader will recall that the system of tone choice realized both an

information function (adding new information or marking information as

assumed to be known) and also a social function in expressing

relationships between participants in the discourse (exemplified above in

the discussion of rising markers). In addition, choices in the system

projected speaker orientation. In direct discourse, i.e., discourse

oriented toward the hearer(s), speakers negotiate a state of convergence

using R and P tones. In oblique discourse, marked by O and P tones,

the context of interaction is temporarily suspended and orientation is

toward the language sample. This section is divided into two parts.

The first part focuses on direct orientation in the teaching

presentations and the use of R and P tones. The second looks at the

use of O tones and evidence of oblique orientation.

The four-minute extracts taken from the prelab presentations in

the laboratory classes each contained a lower limit of 100 tone units;

therefore, tone choices were counted as percentages for the first 100

tone units of each presentation. The math extract was only two minutes

in length, and a count was made of the first 50 tone units, which was

then doubled for the purposes of describing the numbers of tone

choices across the data set. This decision was made on the basis that

presentations of each math problem in the math discussion sections seen

by this researcher are virtually identical in their organization and

composition, and the features found in the presentations given by this

TA also match those found in analyses of similar math discussion

sections conducted by other researchers (Rounds, 1987; Byrd &

Constantinides, 1990).
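
The counting procedure described above can be illustrated with a minimal sketch. It is not the procedure actually used to prepare the data; it assumes each tone unit carries exactly one tone label, that the transcription symbols p and p+ count as P tones, r and r+ as R tones, and o as O tones, and it uses a hypothetical function name (tone_percentages) and invented tone sequences rather than values from the data set.

```python
from collections import Counter

# Assumed mapping from transcription symbols to tone categories:
# p, p+ -> P (proclaiming); r, r+ -> R (referring); o -> O (level).
TONE_CLASS = {"p": "P", "p+": "P", "r": "R", "r+": "R", "o": "O"}


def tone_percentages(tone_units, window=100):
    """Return P/R/O counts per 100 tone units from the first `window` units."""
    sample = tone_units[:window]
    if len(sample) < window:
        raise ValueError(f"need at least {window} tone units, got {len(sample)}")
    counts = Counter(TONE_CLASS[t] for t in sample)
    scale = 100 / window  # 1.0 for a 100-unit window, 2.0 for a 50-unit window
    return {tone: counts[tone] * scale for tone in ("P", "R", "O")}


# Hypothetical tone sequences, not taken from the data set.
lab_extract = ["p"] * 70 + ["r"] * 15 + ["r+"] * 5 + ["o"] * 10 + ["p"] * 20
math_extract = ["p"] * 30 + ["r"] * 15 + ["o"] * 5

lab_counts = tone_percentages(lab_extract)               # first 100 units only
math_counts = tone_percentages(math_extract, window=50)  # first 50 units, doubled
```

Scaling the 50-unit count by two puts the shorter math extract on the same per-100 footing as the laboratory extracts, which is what allows the presentations to be compared in a single table.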

Direct Orientation: R and P Tones

The following table summarizes the tone choices made in the first

100 TUs of each teaching presentation. The table is ordered by the

amount of R tone choices found in each presentation.

The tone choice counts show a predominance of P tones. This is

typical of classroom discourse which is largely involved with "telling",

Table 4-4. Percentage of Tone Choices in the First 100 Tone Units in
Each Presentation in the NS Data.

TA    P    R    O
BL   62   34    4
MK   57   31   12
LE   43   29   28
KN   66   17   17
SN   72   14   14
BD   75   13   12

i.e., using proclaiming tones to present new information to the students.

The use of R tones separates the six presentations into two groups

based to some extent on the amount of information the TA can assume

the students know and what is assumed to be new. In the first group
