Concepts and Issues in Orthographic Design

Material Information

Concepts and Issues in Orthographic Design
Bontrager, Gregory H
Place of Publication:
[Gainesville, Fla.]
University of Florida
Publication Date:
Physical Description:
1 online resource (91 p.)

Thesis/Dissertation Information

Master's ( M.A.)
Degree Grantor:
University of Florida
Degree Disciplines:
Committee Chair:
Committee Co-Chair:
Graduation Date:


Subjects / Keywords:
Graphemes ( jstor )
Language ( jstor )
Linguistics ( jstor )
Literacy ( jstor )
Orthographies ( jstor )
Phonemes ( jstor )
Phonemics ( jstor )
Pronunciation ( jstor )
Vowels ( jstor )
Words ( jstor )
Linguistics -- Dissertations, Academic -- UF
design -- language -- orthography -- phonemic -- rule -- spelling -- system -- writing
bibliography ( marcgt )
theses ( marcgt )
government publication (state, provincial, territorial, dependent) ( marcgt )
born-digital ( sobekcm )
Electronic Thesis or Dissertation
Linguistics thesis, M.A.


Linguists are often called upon to develop orthographies for languages with no previous literacy, and the most popular script choice is the Roman alphabet due to its serving as the core of the International Phonetic Alphabet and to the traditional scope of Western influence. The most typical methodology is to analyze the host language's phonology and assign an alphabetic letter to each phoneme. Yet the task is not always quite that simple and may in fact rarely be so. Orthographies can have a kind of internal structure beyond strictly isomorphic sound-to-symbol mappings. Furthermore, sociolinguistic and cultural factors often wield a surprising impact on the success of a spelling system. In any case, there appears to be no normative or even semi-normative terminology or framework for analyzing and comparing orthographic proposals along multiple dimensions both formal and social. The forthcoming work attempts to fill that gap by proposing a terminological toolkit for the analysis of an orthography from a formal linguistic perspective as well as surveying the important socio-cultural concepts and issues that are likely to prove relevant in devising a new orthography. Ultimately, the author's hope is to better focus discussion of orthographic design choices and thereby expedite the design process by which an orthography is developed and deployed. ( en )
General Note:
In the series University of Florida Digital Collections.
General Note:
Includes vita.
Includes bibliographical references.
Source of Description:
Description based on online resource; title from PDF title page.
Source of Description:
This bibliographic record is available under the Creative Commons CC0 public domain dedication. The University of Florida Libraries, as creator of this bibliographic record, has waived all rights to it worldwide under copyright law, including all related and neighboring rights, to the extent allowed by law.
Thesis (M.A.)--University of Florida, 2015.
Statement of Responsibility:
by Gregory H Bontrager.

Record Information

Source Institution:
Rights Management:
Copyright Bontrager, Gregory H. Permission granted to the University of Florida to digitize, archive and distribute this item for non-profit research and educational purposes. Any reuse of this item in excess of fair use or other copyright exemptions requires permission of the copyright holder.
LD1780 2015 ( lcc )






© 2015 Gregory H. Bontrager


To my grandparents, without whose constant and eager support I would be neither half the scholar nor half the man that I am today


ACKNOWLEDGMENTS

I would like to acknowledge my advisory committee, comprised of Dr. Fiona McLaughlin and Dr. Ann Kathryn Wehmeyer, for expanding the horizons of my outlook on orthography, for aiding in the procurement of valuable sources of information, and for their constructive scrutiny of my work. Additional acknowledgements must be made to the authors whom I have cited in this project, especially the inspirational and indispensable Mark Sebba. Like many scholars, I stand upon the shoulders of giants.


TABLE OF CONTENTS

ACKNOWLEDGMENTS
LIST OF TABLES
LIST OF FIGURES
ABSTRACT

CHAPTER
1 INTRODUCTION
2 TYPOLOGY OF SCRIPTS
    Morphography versus Phonography
    Types of Phonographies
3 ANATOMY OF AN ALPHABETIC ORTHOGRAPHY
    Correspondences and Rules
    Italian and Spanish
4 MEASURING PHONEMICITY AND GRAPHEMICITY
5 A SOCIO-CULTURAL PERSPECTIVE
    Lessons from Post-Colonial History
    A Socio-Cultural Questionnaire for Orthographic Design
    Measuring Proximity to a Model Orthography
6 CASE STUDIES
    Dschang
    Wolof
    Mauritian Kreol
7 CONCLUSION

LIST OF REFERENCES
BIOGRAPHICAL SKETCH


LIST OF TABLES

Table
4-1 Phonemicity Tabulation for Spanish Orthography
4-2 Graphemicity Tabulation for Spanish Orthography
4-3 Phonemicity Tabulation for Spanish Orthography with Weighted Values for
6-1 Comparison of Consonants in French-Based versus Standard Wolof Orthography


LIST OF FIGURES

Figure
2-1 Typology of Scripts


Abstract of Thesis Presented to the Graduate School of the University of Florida in Partial Fulfillment of the Requirements for the Degree of Master of Arts

CONCEPTS AND ISSUES IN ORTHOGRAPHIC DESIGN

By GREGORY H. BONTRAGER

May 2015

Chair: Fiona McLaughlin
Major: Linguistics

Linguists are often called upon to develop orthographies for languages with no previous literacy, and the most popular script choice is the Roman alphabet due to its serving as the core of the International Phonetic Alphabet and to the traditional scope of Western influence. The most typical methodology is to analyze the host language's phonology and assign an alphabetic letter to each phoneme. Yet the task is not always quite that simple and may in fact rarely be so. Orthographies can have a kind of internal structure beyond strictly isomorphic sound-to-symbol mappings. Furthermore, sociolinguistic and cultural factors often wield a surprising impact on the success of a spelling system. In any case, there appears to be no normative or even semi-normative terminology or framework for analyzing and comparing orthographic proposals along multiple dimensions both formal and social. The forthcoming work attempts to fill that gap by proposing a terminological toolkit for the analysis of an orthography from a formal linguistic perspective as well as surveying the important socio-cultural concepts and issues that are likely to prove relevant in devising a new orthography. Ultimately, the author's hope is to better focus discussion of orthographic design choices and thereby expedite the design process by which an orthography is developed and deployed.


CHAPTER 1
INTRODUCTION

It can be all too easy for linguists to view orthography as a comparatively trivial matter. It is usually invoked merely as a convenient and practical medium through which the true object of inquiry may be examined. It is an artificial veneer thinly superimposed over natural language to make it easily reproducible and transmissible for the purpose of scientific scrutiny, while the International Phonetic Alphabet (IPA) tends to be used only in the examination of phonological and/or phonetic phenomena, which the relevant orthography may be ill-equipped to reveal. The written word is relatively seldom placed under the linguist's microscope in the same way that the grammar or sound system is. Some linguists may even argue that to analyze orthographic habits and/or conventions within any given speech community is much more of a concern for the host society itself rather than any outside analysts. Given the prescriptivism that is often bound up with spelling in many highly literate cultures, this is an understandable perception due to the implicit oath of descriptivism that all students take upon starting their linguistic training.

This is not to say that the marginal status of orthography in the scientific study of human language is completely groundless from an empirical standpoint. Strictly speaking, writing is not language. The latter existed for several millennia before the former was innovated independently in relatively few locations throughout the world, and there remain many languages in the modern world that lack any written form. Children learn to speak their native languages with instinctive ease and relatively little targeted instruction from their elders, while learning to write the very same tongues requires far more deliberate and conscious effort on the part of both teacher and student. However, pre-literate, illiterate, and even juvenile speech demonstrates all of the rich communicative capabilities that, as even the general public would likely agree, define language.


Therefore, barring a dramatic and arguably unjustified re-definition of the term "language," one must conclude, at the very least, that writing is not among the prerequisites for a communicative system to qualify as a language. It then follows that the classification of writing as a component of language itself rather than a separate though clearly related system becomes questionable. As such, it is perhaps unsurprising that, at least to the author's knowledge, linguists have no cohesive and agreed-upon formalism for analyzing and comparing orthographic systems and practices.

Most of the work done in what one might call a discipline of "graphology" lies in the realm of sociolinguistics, and even there, writing has long been rather marginalized as an object of inquiry. In response to the justifications given in the preceding paragraph for the statement that writing is not language, Lillis (2013) would probably refer to them collectively as another incarnation of "the primacy argument," which is just as potent an obstacle to the rigorous examination of writing in sociolinguistics as it is in formal linguistics. Aside from focusing on context, sociolinguistic principles also tend to prize the natural and quotidian. This is because the most spontaneous and everyday forms of language are considered the most authentic and therefore more worthy of study. Conspiring with this perception are two related ideas about written language. First, it is commonly viewed as automatically less authentic and context-dependent than spoken language. Second, it is also greatly associated with standardization, formality, and in turn with prescriptivism. Lillis argues against this perception of writing as a necessarily artificial and therefore largely irrelevant form of language (Lillis 2013), and in doing so, she paves the way for written language to be scrutinized much more in light of its social and cultural context.
Such a deeper and more contextual perspective can potentially shed much light on how best to approach a project of orthographic design and promulgation. However, not all linguists engaging in the development of orthographies for previously unwritten languages are aware of the multiple facets to such a project, and this is where the lack of any common analytical framework may have some important consequences, especially since orthographic development is probably the one way in which linguists can have the clearest and strongest impact on the speech communities that they study and/or serve. It is also likely to be the one linguistic specialty (besides sociolinguistics itself) in which socio-cultural factors can have the greatest impact on the success of the linguist's work. In addition to Lillis, some other sociolinguists such as Sebba (2007) and Lüpke (2011) have further warned against designing and/or assessing an orthography based on purely functional criteria, while Cook (2001) has argued that greater attention to writing systems can be relevant in applied linguistics as well.

While the author acknowledges the importance of both functional and socio-cultural perspectives on orthography, neither one of these outlooks can rely on even a semi-standardized analytical or terminological apparatus to aid in the assessment of any given orthographic system along multiple dimensions for easy comparison. In other words, if we wish to carefully consider and weigh both functional and socio-cultural factors, either along with or against each other, then we must first be able to say more objectively, clearly, and concretely where exactly the proposed orthography stands with respect to each perspective. What follows is a humble attempt to lay the foundation of a basic terminology and analytical toolkit for doing exactly that. Since most new orthographies designed by linguists are alphabetic, it is on that script type that the focus will be, although morphographic systems will be discussed briefly in the forthcoming review of basic established terms.


CHAPTER 2
TYPOLOGY OF SCRIPTS

Morphography versus Phonography

Graphic scripts can be divided into two broad categories, according to which linguistic unit the individual graphemes encode. A morphography is a system in which each grapheme can be decoded into a morphosyntactic unit (i.e. a single morpheme), whereas a phonography is a system in which each grapheme can be decoded into a phonological unit (i.e. a phoneme, mora, or syllable). A given writing system may be classified by decomposing a written utterance into its smallest indivisible symbols and determining the irreducible interpretation of each individual one. If the result is an arrangement of morphemes, then the script in question is morphographic. If it is instead an arrangement of phonemes, morae, or syllables, then the script is phonographic (Rogers 2004).

At this point, it may be prudent to remind ourselves that absolutely pure morphographies or phonographies are very rare if they even exist at all. We need not even turn to any of the more obvious cases of hybridity, such as the robust use of both morphographic kanji and phonographic kana in the Japanese script, to see this. Chinese, typically cited as the archetypal morphography, sometimes incorporates phonetic diacritics (often called radicals) to clarify potential ambiguity. Most or all European and other Latin-derived scripts represent numbers morphographically via the Arabic numerals, even as all other words in the same text may be written phonographically. Unless otherwise stated, then, when we speak of "phonographic" or "morphographic" systems, those terms shall include those scripts that at least demonstrate solid predominance of one type of writing or the other, regardless of the ultimate purity thereof.

While this work will focus on phonographies due to their popularity among contemporary orthographic designers, this focus should not be mistaken for an endorsement of phonography as an intrinsically superior strategy. Each one has certain advantages and disadvantages. As shown in the case of the Chinese script, for instance, morphographies are well suited for circumventing problems faced by languages with a great deal of dialectal variation, because a single morpheme can be unambiguously referenced by the same written glyph regardless of regional differences in how that morpheme is realized phonologically. To better understand this, a Westerner need only revisit the example of Arabic numerals. The symbol <2> means the same thing to all Europeans and/or Euro-Americans despite its association with very different pronunciations across different nations, such as two, dos, deux, and zwei. The vast majority of morphemes, numerical or not, are subject to that same flexibility in Chinese, whose regional varieties are sometimes said to stretch the boundary between "dialect" and "language." Morphographies also tend to render connections among related words, such as free roots and derivatives thereof, visually obvious by maintaining the same appearance of the common root across all forms. A strict phonography, on the other hand, tends to obscure such relationships by explicitly representing morphophonological alternations.

Phonographies, however, can function with a much more concise roster of unique graphemes than morphographies, at most a few hundred and often only a few dozen symbols as opposed to the thousands if not tens of thousands needed in a morphography. This almost certainly leads to a much gentler learning curve in literacy acquisition. Furthermore, the very same property of morphographies described above as an advantage can become a hindrance in some socio-cultural situations or environments.
While morphographies leave very little room for the performance of regional identity, a typical phonography can rather easily serve as a very transparent vehicle for such performance. Consider, for instance, the national anthem lyrics for a nation born from colonization and subsequent secession. While the former colonists may, in the vast majority of contexts, continue writing in a prestige register still shared with the mother country, they may also quite justifiably wish to write texts particularly intended to express patriotism according to their own accent. Phonographies generally facilitate this patriotic expression much more straightforwardly than do morphographies.

Types of Phonographies

The phonographic superclass may be further divided into phonemic systems and syllabic systems. In a phonemic script, each individual phoneme is encoded by an independent symbol. In a moro-syllabic script, each grapheme encodes at least a mora if not a whole syllable (Rogers 2004).¹ Ultimately, at the narrowest level of categorization, there are two subtypes within each subclass. Phonemic scripts comprise abjads and alphabets, and syllabic scripts include abugidas and true syllabaries.

An abjad is a system in which each glyph can be decoded into a consonant phoneme, but the vowels are left unwritten and therefore must be deduced from context. Most of the languages commonly cited as examples of this type are Semitic, including Hebrew and Arabic. The letters in an alphabet, on the other hand, explicitly encode all phonemes, consonants and vowels alike (Rogers 2004). Most European scripts are alphabetic, and Western influence has rendered the Roman alphabet currently the most popular script worldwide. Due to their current popularity and consequent relevance to the development of new orthographies, alphabetic scripts shall be the focus of the forthcoming chapters.
¹ Rogers actually identifies moraic and syllabic systems separately, but for our purposes, the binary distinction in linguistic unit of representation between a phoneme and any phonological unit larger than a phoneme but smaller than a word seems sufficient.

An abugida is a syllabic script whose freestanding letters form an abjad, which is then supplemented with a system of diacritics that signal the appropriate vowels. The most common example thereof is the Devanagari code of India. Finally, a true syllabary is defined by a roster of entirely freestanding graphemes that independently encode specific syllables or morae, nuclei and all, with no reliance on diacritics. A textbook example of this type is Japanese kana, which actually encompasses two distinct but collaborating moraic syllabaries known as katakana and hiragana (Rogers 2004).

As in the case of the morphographic and phonographic superclasses, here too it is worth observing that absolutely pure abjads, alphabets, abugidas, or syllabaries are quite rare at best. Once again, even the oft-used archetypes of these subtypes demonstrate some non-canonical features. For example, although the Arabic code does not explicitly mark short vowels, long vowels are indicated by the glyphs otherwise used for the corresponding glides, thus making it partially alphabetic. In education, Arabic employs a system of diacritics to signal the otherwise unwritten short vowels for easier learning, allowing it to effectively become an abugida. Unless otherwise clarified, then, the terms just discussed shall be used according to the dominant trends and underlying strategy of a script, regardless of any negligible deviations from the definitions provided.

A parallel but more gradated parameter that may be invoked for categorical purposes is orthographic depth. Although it is often loosely defined as the proportion of irregularity in an orthography, a more precise definition can be revealed in the work of Katz and Frost (1992). Strictly speaking, orthographic depth refers to where exactly on the generative chain from underlying (i.e. purely morphemic) to surface (i.e. purely phonemic) the orthography typically draws the basis for its rendition of the morpheme in question.
A deep orthography prefers to spell the same morpheme according to the pronunciation of its most basic allomorph even in positions where morphophonemic alternation renders that spelling inaccurate from a purely phonemic perspective. The consistent use of <ed> for the English past tense inflection /d/, including cases that devoice the /d/ into a generally contrastive /t/, is an example of a feature that is typical of a deep orthography. A shallow orthography, on the other hand, draws the basis for its representation closer to the surface, explicitly representing morphophonemic alternants of the same morpheme. A shallower rendition of the past tense suffix in the English word "laughed," for instance, would be "laught."

In some sense, the spectrum of orthographic depth mirrors the continuum between phonography and morphography. In other words, a morphography could be thought of as the absolute "deepest" that a system could get, while a pure phonography could be called the absolute "shallowest." However, the notion of orthographic depth tends to be used to describe varying degrees of concession to morphology in what is principally a phonography. It can be difficult to measure beyond somewhat subjective estimates, but at least one intriguing method has been explored that put normalized corpora of written words and their respective pronunciations in multiple languages through an algorithm that uses both decision trees and probabilistic "learning." Orthographic depth was determined by measuring the accuracy of predictions made when novel words were fed into the program as well as how deep the trees tended to be for each language (Van Den Bosch et al. 1994).

In the process of reading and decoding written text, two routes of lexical recognition are theorized. In phonological access, each written word (or morpheme) is decoded into a phonemic representation which then serves as the key for which a match is sought in the mental lexicon. In "visual orthographic" access, the graphic representation itself is the key which is then matched to an item in a reader's visual memory.
In every case, whichever route identifies the word fastest is presumably the one that is applied. Factors such as frequency and reader experience are argued to influence the relative prominence of these two mental strategies. For example, more frequent items may be more likely to evoke visual orthographic decoding than phonological, while novice readers probably rely more heavily on the latter. The orthographic depth hypothesis includes depth of spelling among those factors, positing a correlation in which phonologically mediated access is more dominant in shallow orthographies and yields much of its role to visual orthographic access in deeper ones. This claim is best tested comparatively on multiple languages (Katz and Frost 1992).

One such study was conducted by Seymour, Erskine, and Aro (2003). Elementary school children native to several European languages were tested on letter naming, reading known words aloud, and pronouncing nonce words. The languages in question were categorized according to syllable complexity, with results compared both within and across the two classifications. The overall results suggested that shallower orthographies significantly accelerate the acquisition of literacy in languages like the ones studied, especially for those that allow complex syllables and consonant clusters.

Figure 2-1. Typology of Scripts
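
As an informal companion to the typology described in this chapter, the classification can be sketched as a small decision procedure. This is only an illustrative simplification under stated assumptions: the function name `classify_script` and its parameters are hypothetical, and the sketch classifies by a script's dominant strategy, ignoring the hybridity that the chapter notes is present in nearly every real system.

```python
def classify_script(unit, writes_vowels=True, vowel_diacritics=False):
    """Return (superclass, subclass, type) for a script's dominant strategy.

    `unit` is the linguistic unit that each grapheme decodes into.
    All names here are hypothetical and chosen only for illustration.
    """
    if unit == "morpheme":
        # Each grapheme decodes into a morpheme: a morphography.
        return ("morphography", None, None)
    if unit == "phoneme":
        # Phonemic scripts: an alphabet writes vowels, an abjad omits them.
        kind = "alphabet" if writes_vowels else "abjad"
        return ("phonography", "phonemic", kind)
    if unit in ("mora", "syllable"):
        # Syllabic scripts: an abugida marks vowels with diacritics,
        # a true syllabary uses freestanding syllable or mora graphemes.
        kind = "abugida" if vowel_diacritics else "syllabary"
        return ("phonography", "syllabic", kind)
    raise ValueError(f"unknown linguistic unit: {unit!r}")


print(classify_script("phoneme", writes_vowels=False))
print(classify_script("syllable", vowel_diacritics=True))
```

Orthographic depth is deliberately left out of the sketch, since the chapter treats it as a gradated parameter rather than a categorical one.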


CHAPTER 3
ANATOMY OF AN ALPHABETIC ORTHOGRAPHY

Much of Chapter 3 and Chapter 4 [...] graphemes along three hierarchical dimensions: simple versus complex, proper versus improper, and monophonematic versus biphonematic. Simple graphemes are composed of one letter, while complex graphemes are composed of multiple letters. An improper grapheme is a silent one and can be further classified as true (i.e. completely irrelevant to pronunciation) or false (i.e. bearing some impact on the pronunciation of surrounding graphemes). Finally, as [...] graphoneme, the pairing of a grapheme with a phoneme (or sequence thereof) which lends no inherent primacy to either half. Graphonemes may be sorted according to homography or homophony (depending on how they are expressed linearly) in order to analyze their distribution. Graphonemic distribution may be positional (dependent on neighboring segments), lexical (dependent on lexical or morphemic identity), or contextual (dependent on morphophonology).

Finally, orthographies can also be compared to four idealized archetypes. First, a phonological system is purely isomorphic, with every grapheme representing one and only one phoneme and every phoneme represented by one and only one grapheme. Second, a monophonematic system is one in which every grapheme has only one possible pronunciation but not every phoneme has only one possible spelling. Third, a monographematic system is one in which every phoneme has only one possible spelling but not every grapheme has only one possible pronunciation. A fourth type, called heterophonematic and heterographematic, has both homography and homophony [...]

[...] biphonematic grapheme" is defined alternatively as a "polyphone," while the original coinage "raw correspondence" seems practically synonymous with "graphoneme" [...] mention of "positional" distribution foreshadows the concept of "positional rule" to be elaborated more fully here. In addition, the four orthographic archetypes, particularly the second and third, allude to the distinction between the parameters of "phonemicity" and "graphemicity" which will be introduced in Chapter 4.

Correspondences and Rules

An alphabetic orthography may essentially be decomposed into two major components. Raw correspondences map a language's phonemes to the available graphemes, ideally in an isomorphic way. In most cases, each phoneme will be assigned to a single, irreducible symbol called a monograph. If, for historical or social reasons, there are more phonemes than there are available graphemes, one or more polygraphs may be used. A polygraph is a fixed combination of monographs that a reader interprets as encoding a single phoneme which is independent from the one encoded by either component individually. Examples include digraphs such as <sh> or trigraphs such as <sch>, which usually represent /ʃ/ in English and German, respectively. The converse of a polygraph is a polyphone, which is a single monograph that singularly stands for an entire sequence of phonemes (Rogers 2004). The most familiar example of a polyphonic correspondence is likely the association of /ks/ and/or /gz/ with <x> in English and some other Indo-European tongues. Any correspondence can be stated formally from either a decoding (i.e. reader's) perspective or an encoding (i.e. writer's) perspective.


(1) Sample Decoding Correspondences      (2) Sample Encoding Correspondences
    (a) [...] = /j/                          (a) /j/ = [...]
    (b) [...] = /[...]/                      (b) /[...]/ = [...]
    (c) <x> = /ks/, /gz/                     (c) /ks/ = <x>
                                             (d) /gz/ = <x>

Here already we have an example of a phonemic ambiguity, which manifests itself either as a decoding correspondence with multiple phonemes on the right side of its expression or multiple encoding correspondences with the same written symbol(s) on the respective right sides of their expressions. It is not uncommon for such polyvalences to be the raw material on which the second main component of an alphabetic orthography operates. That supplementary module is a system of positional rules. A positional rule is a rule by which one or more possible mappings of sound to symbol or vice versa can be eliminated based on the relevant grapheme or phoneme's position relative to other graphemes, other phonemes, word edges, and/or contrastive prosodic features. One example from a decoding perspective in Italian is given below.

(3) <c> → [tʃ] / {<_e>, <_i>}, [k] / elsewhere

That is, <c> is pronounced as /tʃ/ when followed by <e> or <i> and as /k/ in all other positions. This positional rule clarifies ambiguities created by the raw correspondence <c> = /tʃ/, /k/. Orthographies like the Italian standard derive their reputed consistency not strictly through isomorphic correspondences but rather through the synergy of raw correspondences and positional rules. A related example from an encoding perspective is presented in (4).


(4) /k/ → <ch> / {[_i], [_e]}, <c> / elsewhere

The rule expressed above is that /k/ is spelled <ch> before the front vowels /i/ and /e/ and <c> in all other positions. At this point, the reader may notice that (3) and (4) seem suspiciously complementary, and indeed, there is a remarkably puzzle-piece-like quality to the orthographic system of Italian, which has some uncanny analogues in Spanish as well. A more thorough examination of both of these languages' spelling codes presents us with an excellent example of what can be done via the interaction of raw correspondences and positional rules, and so, it is only fitting to proceed with just such an in-depth look at these two revealing cases.

Italian and Spanish

The sample rules presented in the preceding section are part of a very cohesive system that provides fascinating insight into how raw correspondences and positional rules can interact. In Italian, the monographs <c> and <g> each have what is referred to in traditional pedagogy as a "soft" and a "hard" pronunciation. The former is used before the front vowels /i/ and /e/, with the latter being the elsewhere case. This can be stated with the following raw correspondences and positional rules, written from a decoding perspective.

(5a) <c> = /tʃ/, /k/
(5b) <g> = /dʒ/, /g/
(6a) <c> → [tʃ] / {[_e], [_i]}, [k] / elsewhere
(6b) <g> → [dʒ] / {[_e], [_i]}, [g] / elsewhere

The important consequence here is that the phonemic ambiguity of <c> and <g> identified in (5) never surfaces in actual text, because the rules in (6) allow the reader to reliably predict how any instance of either letter should be pronounced based on its immediate surroundings. However, whenever an Italian needs to spell a word containing a phoneme sequence such as /ke/ or /gi/, a potential conflict arises. Since there is no other monograph that can represent /k/ or /g/, the writer would be compelled to write <ce> or <gi> (/e/ and /i/ are written just as they are in the [...] i/. So how would an Italian author encode /ke/ or /gi/? The solution lies in two more positional rules that arguably operate on a correspondence that exists independently (that of a silent <h>). This time, the rules in question are perhaps best expressed from the encoding perspective.

(7) <h> = Ø
[...]

The correct spelling of /ke/ or /gi/ in Italian is therefore <che> or <ghi> respectively. The <h>, which is consistently silent in any case, functions as a buffer letter (a letter positioned to block the effects of a positional rule) and forms a digraph with the preceding consonant. In essence, positional rules (6a) and (6b) filter out a phonemic ambiguity from ever surfacing, but they also make some phonotactically permissible sequences impossible to spell. Positional rules (8) and (9) then rectify that gap. The conclusion to be drawn from this is that clarifying an ambivalence wrought by an anisomorphic set of raw correspondences is not the only function that a positional rule can serve. It may also relieve the side effects of another rule.

It would be a mistake to end our investigation here, however. Further exploration reveals that the system is, in one important sense, bidirectional. We have seen that <e> and <i> are the only two vowels that trigger the affricate pronunciations of otherwise plosive <c> and <g>, but in the above examples (<ce> and <gi>), the respective vowels' primary purpose was to represent independent phonemes, with the altered pronunciation of the preceding consonant being merely incidental. As it turns out, this need not always be the case. One of these vowel letters, namely <i>, can essentially serve a role converse to that of the <h> in <che> or <ghi> via the following positional rule.¹

[...]

[...] have no available representations other than <c> and <g>, but those respective pronunciations are only available before <e> or <i>. The above rule, co[...] to be clearly represented as <cio> or <gio>. Crucially, the <i> itself is unpronounced in such spellings, having no phonological purpose. Its only role is to create the orthographic conditions for (6a) or (6b) to apply. It may also be worth noting that the pronunciation of <i> remains perfectly predictable, since one can always know whether or not to pronounce the <i> based on its immediate surroundings.

¹ It should be noted that this rule is not, in fact, absolute, as shown by words such as tecnologia, in which the <i> is pronounced. It is possible that consideration of stress may allow this to be further regularized, but in any case, such exceptions are abstracted away for the purpose of illustration.

Italian's close sister, Spanish, has a strikingly similar system of positional rules through which certain graphemes or diacritics may feed or bleed the application of some of those rules. As in Italian, the letters <c> and <g> each have two different possible pronunciations, one of which is always eliminated by the identity of the phoneme to its immediate right.

[...]
(11b) <g> = /x/, /g/
(11c) <g> = /x/ / {[_e], [_i]}, [g] / elsewhere


24 Also as in the Italian case, this arrangement creates a grapholexical gap , or group of possible phoneme sequences that would become unspellable if not facilitated by further rules. Either of the very same examples from Italian, /ke/ or / gi/, can also illustrate the Spanish system. Without any other monograph available for /k/ or /g/, a Spanish writer would be compelled to spell or , but according to the rules in (12), these respectively. Spanish resolves this conundrum with the rules in the following fashion. [g_i]}, [u] elsewhere Rule (13 ) expresses an outright substitution of t he fixed digraph , while (14 ) allows to serve a role analogous to that of the in Italian . However, contrary to the Italian , which is always silent, the default value of Spanish is not null, thus nece s sitating rule (15 ) to allow the prediction of when it should be pronounced versus when it should be silent. A further wrinkle occurs via the existence of sequences such as /gui/. Again, a Spanish writer would be force d to encode this as , since there is no other way to spell the vowels /u/ or /i/. As one might expect by now, though, this would be interpreted by the reader as /gi/. For such cases, Spanish employs a diacritic to signal the suspension of rule (15) and the retreat to the default pronunciation of .
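The interplay of these Spanish rules can be illustrated with a short Python sketch, offered only as an informal aid rather than as part of the formalism. It covers just the decoding of <g>, <gu>, and <gü>. The function name decode_g_sequences and the practice of passing every other letter through unchanged are assumptions made for the example, and the output is a rough letter-based stand-in for a phonemic transcription.

```python
# Front-vowel letters that trigger the /x/ reading of <g> (accented forms included).
FRONT = set("eiéí")


def decode_g_sequences(word):
    """Rewrite <g>, <gu> and <gü> as rough phoneme symbols; pass other letters through."""
    out = []
    i = 0
    while i < len(word):
        ch = word[i]
        nxt = word[i + 1] if i + 1 < len(word) else ""
        nxt2 = word[i + 2] if i + 2 < len(word) else ""
        if ch == "g" and nxt in FRONT:
            # <g> directly before a front vowel is read as /x/.
            out.append("x")
        elif ch == "g" and nxt == "u" and nxt2 in FRONT:
            # <gu> before a front vowel: /g/ with a silent <u>.
            out.append("g")
            i += 1  # skip the silent <u>
        elif ch == "ü":
            # The diaeresis suspends the silencing rule, so <ü> is pronounced /u/.
            out.append("u")
        else:
            # Elsewhere <g> is /g/; every other letter is left as written.
            out.append(ch)
        i += 1
    return "".join(out)


for spelling in ("ga", "gi", "gui", "güi"):
    print(spelling, "->", decode_g_sequences(spelling))
```

The three branches apply in order of specificity, which is one simple way of modelling how the silent <u> and the diaeresis each adjust the outcome of the basic <g> rule.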


25 Meanwhile, an important difference from the Italian model arises with correspondences (11c) and (11d), which facilitate the spelling of sequences such a analo gous to the in Italian . We can observe these rules interacting through some morphographological alternations found in certain forms of particular Spanish verbs. For instance, the root averigu (meaning "to find out") yields the infinitive averigu ar (/abe /), but the first person singular preterite is averigü é (/abe ar in the former does not trigger rule (15), but the suffix é in the latt er does, and it then requires the application of rule (16) to prevent the in the root from being silenced. Another example is the alternation of the verb whose infinitive is escog er but whose first person singular present indicative is written as esco jo . By rule (12b), the in the former is rightly pronounced /x/, but the very same root final consonant is spelled in the latter with a instead as per correspondence (11c). This is because * escogo same root pronunciation across all inflected forms. A final example is the verb alcanz ("to reach"), which becomes alcanz ar in the infinitive but alcanc é in the first person sing ular > and are in what could nearly be called complementary distribution, as are and in representing /x/. Spanish is often held up as a prototypical example of a highly consistent orthography, and indeed, the pronunciation of virtually all Spanish words can be reliably predicted from their spellings. The aim of this examination was to reveal how that oft cited consistency is generated by raw correspondences and particularly by positional rules. As shown by the correspondences in (11), the Spanish alphabet is not absolutely isomorphic in its associations between graphemes


and phonemes, but supplementary bylaws such as (12a) and (12b) prevent the few ambiguities that exist therein from ever surfacing in any text. In this way, Spanish orthography maintains a high degree of regularity with fewer characters and/or digraphs than it would otherwise need to attain the same level of predictability. The very same can be said for written Italian. While the duality of <c> and <g> in both languages is likely a consequence of historical sound changes (in Latin, these letters only ever stood for stops), those sound changes were regular enough that it was probably quite natural to adapt them into orthographic regularities expressed by the rules just explained.

As a result of this exploration and the proposed formalism for describing orthographic structure, however, the definition of phonemic consistency becomes a bit more elaborate. The phrase could refer either to an orthography's proximity to strict isomorphism (i.e. how closely a system emulates the ideal of one grapheme per phoneme and vice versa) or to the capacity of its positional rules to filter out whatever ambivalences there may be in its raw correspondences. In attempting to measure consistency, it could be helpful to formalize the differentiation between these two senses of the term and provide a way of evaluating an orthography according to either one. For example, while positional rules are powerful, relying on too many of them can hinder orthographic acquisition, and one way to estimate the complexity of a system is to assess its regularity both with and without consideration of positional rules and then examine the difference in the resulting ratings. A significant gulf between the two could suggest that the orthography relies too heavily on such rules and might be too difficult to acquire with the desired speed and ease. In any case, the forthcoming chapter discusses a methodology for measurement that can facilitate such analysis and comparison.


CHAPTER 4
MEASURING PHONEMICITY AND GRAPHEMICITY

Towards the end of the preceding section, two different senses of the term "consistency" were identified. Each interpretation will now be assigned its own term. Raw phonemicity shall refer to the extent to which every unique grapheme is associated with a single phoneme, without any consideration of positional rules. Adjusted phonemicity shall encompass raw phonemicity but also take positional rules into account, measuring the predictability of pronunciation after all such rules have been brought to bear on whatever ambivalences there are in the correspondences (i.e. lapses in raw phonemicity). It may help to think of the former as the predictability of how a symbol may be pronounced individually without any environmental cues and the latter as either the predictability of how a complete written word is pronounced or how each symbol sounds in running text.

Consider again the polyvalence of <c> and <g> in Spanish or Italian. For either language, each of those ambivalences constitutes a reduction in raw phonemicity, but positional rules nullify their effect on adjusted phonemicity. For instance, the grapheme <g> may represent either /g/ or /x/ in Spanish, but no word containing that glyph will ever leave the reader guessing, because the relevant rules always allow one or the other associated phoneme to be eliminated as an appropriate interpretation in any given orthographic environment.

This is still not the complete picture, since both raw and adjusted phonemicity address the orthography strictly from the decoding perspective. In other words, both parameters measure the estimated number of decisions and/or discriminations that the reader must make in deciphering text. Now, what about the writer? From the encoding perspective, one can invoke the parallel notion of graphemicity, which conversely measures the estimated number of decisions and/or discriminations that the writer must make in producing text.
Graphemicity can analogously be


approached in terms of raw graphemicity or adjusted graphemicity. These terms are defined similarly to the equivalent measures of phonemicity (i.e. the inclusion or exclusion of positional rules in the equation). It is tempting to think of phonemicity and graphemicity as essentially two sides of the same coin that will inevitably be equal in the results of any orthographic analysis. However, this would be a hasty hypothesis. Consider, for example, a hypothetical code in which two different letters may both be used to stand for /k/. Any time a reader sees either letter, he/she will readily know to pronounce it /k/, so the ambivalence is no detraction in phonemicity. Any time a writer wants to spell a word containing /k/, on the other hand, he/she may not necessarily know (at least not based on purely phonological criteria) which of the two letters to use, and so this does constitute a detraction in graphemicity.

For a more concrete example, one need only consider the very real use of silent <h> in Spanish. A consequence of having any grapheme that never independently stands for any phoneme is that many other phonemes acquire two possible spellings each, one with and one without the perpetually unpronounced symbol. A Spanish reader may easily know to pronounce <he> as /e/ or <hai> as /ai/, but whenever a Spanish writer wants to encode those same sounds, he/she must quite often choose between <he> and just <e> or <hai> and just <ai>. Raw graphemicity drops sharply as a result, and in this case, so does adjusted graphemicity, since positional rules are little if any help. This is because there are not enough that qualify. For analytical purposes, a rule can only be counted if morphemic or lexical identity plays no part in the pattern. It must refer purely to phonological and/or graphological criteria. The Spanish <h> frequently serves no phonological purpose but instead provides an element of morphography, distinguishing certain pairs of homophones, such as ay, an interjection, and hay, meaning "there is/are."
Of course, <h> could be argued to contribute phonological information when it forms


part of the digraph appears in places where no buffer letter is necessary or applicable. Again, its function is instead to distinguish homophones, a clearly morphographic role.

One very practical way to arrive at a rough estimate of adjusted phonemicity is to take a sample paragraph and calculate the average number of possible pronunciations per word. Such a passage would preferably, though not necessarily, contain at least one instance of every phoneme in the language. The reciprocal of the resulting ratio, expressed as a percentage, objectively rates the predictability of word pronunciation. As an example, let us examine the following paragraph of written Spanish and derive an estimate of the orthography's adjusted phonemicity.

Es quizás una de las mejores ironías de la historia que a la cultura medio oriental, con que la nuestra tiene larga historia de enemistad, debemos mucho del cuerpo literario e intelectual que hoy en día forma gran parte de la tradición occidental. Con la conquista de Granada en 1492, se terminó por fin el dominio de los moros sobre casi toda la península ibérica que se había establecido casi siete siglos antes, pero su gran influencia cultural nunca se pudo extinguir. Que no subestimemos el poder simbólico de los mozárabes que, por medio de simplemente existir, demostraban que sí se podían convivir, hasta dentro del individuo sí mismo, la fe católica y la sabiduría árabe.

Of these 114 words, one has two possible pronunciations. The letter <x> is generally pronounced /ks/ in Spanish, but some lexemes borrowed from indigenous Central and South American cultures use it instead to represent /x/. The pronunciation of <x> cannot be deduced both of which it occurs intervocalically between a stressed syllable on its left and an unstressed one on its right but still has a different phonemic value in each word. While in extinguir, the immediately following /t/ likely renders /ks/ the only phonotactically


permissible interpretation, there are no such phonotactic predictors in existir, which could /. With all other words having only one possible realization each, this yields a total of 115 possible pronunciations drawn from 114 written words. The quotient of 114/115 is 0.99, meaning that Spanish achieves an adjusted phonemicity rating of 99%.

The converse parameter, that of adjusted graphemicity, may also be estimated in this way, presumably starting with an IPA transcription of the text. Here the silent <h> exerts its greatest impact. Assuming for the sake of argument that it can only occur word-initially, this means that any vowel-initial word has two possible renderings. Hence, a total of 27 words are affected. Es could plausibly be spelled <hes>, una could plausibly be spelled <huna>, ironía could plausibly be spelled <hironía>, etc. Subtracting 27 from 114 gives us 87 words with only one plausible spelling. Adding 87 and 54 (27 ambivalent words with 2 possible spellings each) yields a total of 141 plausible spellings. The quotient of 114/141 is 0.81, resulting in an adjusted graphemicity rating of 81%. A single silent letter, then, has had a very significant impact on the overall graphemicity of Spanish orthography. As we will soon see, its effect is revealed to be even greater when we apply a less rough-and-ready and more precise methodology.

That methodology comes in the form of tabulation. Depending on whether phonemicity or graphemicity is being examined, one begins with either all of the available symbols or all of the phonemes in the language and lists them in the first column. Any and all associated sounds or glyphs for each one, regardless of position, are then listed in the second column, and the number of such associations is recorded in the third. The fourth column identifies how many associations can be eliminated by positional rules, and the fifth column identifies how many associations can ever be left after the application of all relevant positional rules.
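The sample-paragraph estimates given earlier in this section reduce to simple arithmetic, which the following Python sketch reproduces. The function name sample_rating and the dictionary format for the ambiguous words are assumptions introduced only for illustration. The word counts (114 words, 1 word with two pronunciations, 27 words with two plausible spellings) are those reported in the text.

```python
def sample_rating(total_words, ambiguous_words):
    """Estimate an adjusted rating from a text sample.

    total_words: number of words in the sample.
    ambiguous_words: dict mapping a number of alternatives (2, 3, ...)
        to how many words in the sample have that many alternatives.
    """
    # Words not listed as ambiguous contribute exactly one form each.
    unambiguous = total_words - sum(ambiguous_words.values())
    total_forms = unambiguous + sum(
        alternatives * count for alternatives, count in ambiguous_words.items()
    )
    # The rating is the reciprocal of the average number of forms per word.
    return total_words / total_forms


phonemicity = sample_rating(114, {2: 1})    # 114 / 115
graphemicity = sample_rating(114, {2: 27})  # 114 / 141
print(f"{phonemicity:.0%}")   # 99%
print(f"{graphemicity:.0%}")  # 81%
```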


From this tabulation, several figures can then be calculated. The average of column three yields raw phonemicity or graphemicity, while that of column five produces the adjusted ratings. The sum of column four can provide a comparative estimate of how many rules the system relies on. The table of associations itself provides a formal at-a-glance summary of the orthography in question, and average phoneme-to-grapheme or grapheme-to-phoneme ratios may be calculated either directly or by inverting the percentage ratings. In this way, the target spelling system can be objectively assessed for consistency, which can greatly expedite the comparison of competing proposals. Phonemicity and graphemicity may themselves be averaged together into one single number estimating what one might simply call the "consistency" of a particular code. While the results generated by this methodology should not necessarily be viewed as decisive evaluations of an orthography's overall merit, having such clear and definite rankings along strictly linguistic dimensions means that we can proceed sooner and more smoothly with the consideration of other factors without the potential for derailment arising from more subjective or otherwise imprecise estimations of those purely technical parameters.

The following tabulations assess the phonemicity and graphemicity of standard Spanish orthography. In interpreting Table 4-1 and Table 4-2, it will prove important to remember that, in order for a rule to contribute to the "Eliminable" figure for each sound or symbol, it must be completely inviolable according to phonological or graphological (i.e. not morphological or lexical) context. An equally crucial observation, however, may be the divergence between the estimates produced by the sample paragraph analysis above and these supposedly more precise calculations.
The difference between, for instance, 99% and 81% for phonemicity or between 81% and 65% for graphemicity seems large enough to merit some explanation.
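The tabulation arithmetic described above can be sketched in Python as follows. This is a minimal illustration and not a definitive implementation: the function name tabulate and the row format are assumptions, and the per-symbol counts are transcribed from Table 4-1, where every symbol not listed in the ambiguous dictionary has one associated sound and no eliminable association.

```python
def tabulate(rows):
    """Summarize a tabulation given rows of (item, associations, eliminable)."""
    count = len(rows)
    raw_total = sum(n for _, n, _ in rows)           # sum of column three
    adjusted_total = sum(n - e for _, n, e in rows)  # sum of column five
    return {
        "items": count,
        "raw_total": raw_total,
        "adjusted_total": adjusted_total,
        "rules": sum(e for _, _, e in rows),         # sum of column four
        # Each rating is the reciprocal of the average associations per item.
        "raw_rating": count / raw_total,
        "adjusted_rating": count / adjusted_total,
    }


# Symbols of Table 4-1; only five of them have more than one sound.
symbols = (
    "a á b c d e é f g h i í j l m n o ó p r s t u ú ü v x y z ñ "
    "ch ll qu rr ai ái au áu ay áy ei éi eu éu ey éy oi ói oy óy ou óu"
).split()
ambiguous = {"c": (2, 1), "g": (2, 1), "u": (2, 1), "x": (2, 0), "y": (2, 1)}
rows = [(s, *ambiguous.get(s, (1, 0))) for s in symbols]

summary = tabulate(rows)
print(summary["items"], summary["raw_total"], summary["adjusted_total"])  # 52 57 53
print(f"{summary['raw_rating']:.2%}")       # 91.23%
print(f"{summary['adjusted_rating']:.2%}")  # 98.11%
```

Keeping the per-item rows as data, rather than hard-coding the totals, makes it easy to rerun the same summary for a competing proposal.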


Table 4-1. Phonemicity Tabulation for Spanish Orthography

Symbol  Sound(s)  # of Sounds  Eliminable  Difference
a       a         1            0           1
á       a         1            0           1
b       b         1            0           1
c                 2            1           1
d       d         1            0           1
e       e         1            0           1
é       e         1            0           1
f       f         1            0           1
g       g, x      2            1           1
h       Ø         1            0           1
i       i         1            0           1
í       i         1            0           1
j       x         1            0           1
l       l         1            0           1
m       m         1            0           1
n       n         1            0           1
o       o         1            0           1
ó       o         1            0           1
p       p         1            0           1
r                 1            0           1
s       s         1            0           1
t       t         1            0           1
u       u, Ø      2            1           1
ú       u         1            0           1
ü       u         1            0           1
v       b         1            0           1
x       ks, x     2            0           2
y       , i       2            1           1
z                 1            0           1
ñ                 1            0           1
ch                1            0           1
ll                1            0           1
qu      k         1            0           1
rr      r         1            0           1
ai      ai        1            0           1
ái      ai        1            0           1
au      au        1            0           1
áu      au        1            0           1
ay      ai        1            0           1
áy      ai        1            0           1
ei      ei        1            0           1
éi      ei        1            0           1
eu      eu        1            0           1
éu      eu        1            0           1
ey      ei        1            0           1
éy      ei        1            0           1
oi      oi        1            0           1
ói      oi        1            0           1
oy      oi        1            0           1
óy      oi        1            0           1
ou      ou        1            0           1
óu      ou        1            0           1

Total Symbols: 52      Raw Total: 57               Adjusted Total: 53
                       Raw Difference: 5           Adjusted Difference: 1
Total Rules: 4         Raw Ratio: 1.09             Adjusted Ratio: 1.02
                       Raw Phonemicity: 91.23%     Adjusted Phonemicity: 98.11%

Note: For the purposes of this demonstration, the rising diphthongs, such as the [je] in bien ("well") or the [wa] in cuál ("which"), have been assumed to be surface gliding realizations of underlying VV sequences, such as /ie/ and /ua/ respectively, in which an unstressed high vowel immediately precedes any other vowel.

Table 4-2. Graphemicity Tabulation for Spanish Orthography

Sound   Symbol(s)                           # of Symbols  Eliminable  Difference
a       a, á, ha, há                        4             1           3
b       b, v                                2             0           2
d       d                                   1             0           1
e       e, é, he, hé                        4             1           3
f       f                                   1             0           1
g       g                                   1             0           1
i       i, í, hi, hí, y                     5             1           4
        y                                   1             0           1
k       c, qu                               2             1           1
l       l                                   1             0           1
m       m                                   1             0           1
n       n                                   1             0           1
        ñ                                   1             0           1
o       o, ó, ho, hó                        4             1           3
p       p                                   1             0           1
r       rr                                  1             0           1
        r                                   1             0           1
s       s                                   1             0           1
t       t                                   1             0           1
        z, c                                2             0           2
        ch                                  1             0           1
u       u, ú, ü, hu, hú                     5             2           3
x       j, g, x                             3             0           3
        ll                                  1             0           1
ai      ai, ay, ái, áy, hai, hay, hái, háy  8             6           2
au      au, áu, hau, háu                    4             2           2
ei      ei, ey, éi, éy, hei, hey, héi, héy  8             6           2
eu      eu, éu, heu, héu                    4             2           2
oi      oi, oy, ói, óy, hoi, hoy, hói, hóy  8             6           2
ou      ou, óu, hou, hóu                    4             2           2

Total Sounds: 30       Raw Total: 82               Adjusted Total: 51
                       Raw Difference: 52          Adjusted Difference: 21
Total Rules: 36        Raw Ratio: 2.73             Adjusted Ratio: 1.7
                       Raw Graphemicity: 36.59%    Adjusted Graphemicity: 58.82%

Note: The "Eliminable" figure for each vowel would be higher if acute diacritics were only ever used to mark irregular stress, but while stress is very nearly a reliable predictor, there are a few cases of phonologically redundant acutes serving to distinguish homophones. Examples include <sí> ("yes") versus <si> ("if") and <tú> (2SG.NOM) versus <tu> (2SG.POSS). The figure of 0 for for instance, while can only be /x/ before or , is not complementarily prohibited from being used in that same environment. For example, the prescribed rendition of ("boss") is , but it could just as plausibly be spelled .

That explanation lies in the fact that these tables and the resulting figures fail to take lexicon proportionality and token frequency into account. Lexicon proportionality refers to the proportion of a language's total lexicon that may be affected by a particular ambivalence. Token


frequency refers to how often a particular word or association of sound with symbol is typically used. By its very nature, the analysis of a random text sample at least takes token frequency into account. In that methodology, ambivalences relevant to very rare words will accordingly present very few if any instances through which they might otherwise contribute to a lower average ratio of spelling to pronunciation or vice versa. On the other hand, precisely due to their statistical improbability in a randomly selected text, such sparsely used lexemes may not appear at all, which would lead to a slightly inaccurate result. Whether that slight inaccuracy is negligible depends on the context and purpose of measurement, but in any case, the tabulation method is better at capturing exactly this sort of seldom-occurring ambiguity. However, it treats such ambiguities no differently than the more common ones, which can result in phonemicity or graphemicity ratings significantly lower than what the system in question actually achieves in everyday practice.

One solution is to assign each association in an unweighted tabulation such as Table 4-1 or Table 4-2 a non-whole number that reflects its token frequency and/or lexicon proportionality. A particularly rare association, for instance one that only arises in 25% of words containing the pertinent grapheme or phoneme, may be counted as 0.25 instead of 1 for the purpose of third-column summation. The fifth-column calculation would then need to be adjusted so that its value never dips below 1, which would anomalously imply one whole symbol being tied to a fraction of a sound or vice versa. Rather than simply subtracting the fourth-column value from the third-column value, it would take the result of such a subtraction and pass it, alongside 1, through a maximum function (an operation in which the larger of two input numbers is yielded as output).
The result is a weighted tabulation, which can be exemplified by Table 4-3, an adjustment to Table 4-1 in which /x/ is assumed for the sake of demonstration to have a weight of 0.1 as a possible interpretation of <x>.
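A weighted tabulation of this kind can be sketched in Python as shown below. The names weighted_difference and weighted_summary are assumptions made for illustration, as is the row format. The only weighted entry is the one stipulated in the text, namely a weight of 0.1 for /x/ as a reading of <x>, which gives that symbol 1.1 associations in total. The remaining counts follow Table 4-1.

```python
def weighted_difference(associations, eliminable):
    """Fifth-column value: never allowed to fall below one whole association."""
    return max(associations - eliminable, 1)


def weighted_summary(rows):
    """rows: list of (item, associations, eliminable); associations may be fractional."""
    count = len(rows)
    raw_total = sum(n for _, n, _ in rows)
    adjusted_total = sum(weighted_difference(n, e) for _, n, e in rows)
    return raw_total, adjusted_total, count / raw_total, count / adjusted_total


symbols = (
    "a á b c d e é f g h i í j l m n o ó p r s t u ú ü v x y z ñ "
    "ch ll qu rr ai ái au áu ay áy ei éi eu éu ey éy oi ói oy óy ou óu"
).split()
# <x> counts as 1.1 sounds: /ks/ at full weight plus /x/ at a weight of 0.1.
special = {"c": (2, 1), "g": (2, 1), "u": (2, 1), "x": (1.1, 0), "y": (2, 1)}
rows = [(s, *special.get(s, (1, 0))) for s in symbols]

raw_total, adjusted_total, raw_rating, adjusted_rating = weighted_summary(rows)
print(f"{raw_total:.1f} {adjusted_total:.1f}")  # 56.1 52.1
print(f"{raw_rating:.2%}")       # 92.69%
print(f"{adjusted_rating:.2%}")  # 99.81%
```

The max call is what keeps a heavily discounted association from dragging a symbol's fifth-column value below 1.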


Table 4-3. Phonemicity Tabulation for Spanish Orthography with Weighted Values for <x>

Symbol  Sound(s)  # of Sounds  Eliminable  Difference
a       a         1            0           1
á       a         1            0           1
b       b         1            0           1
c                 2            1           1
d       d         1            0           1
e       e         1            0           1
é       e         1            0           1
f       f         1            0           1
g       g, x      2            1           1
h       Ø         1            0           1
i       i         1            0           1
í       i         1            0           1
j       x         1            0           1
l       l         1            0           1
m       m         1            0           1
n       n         1            0           1
o       o         1            0           1
ó       o         1            0           1
p       p         1            0           1
r                 1            0           1
s       s         1            0           1
t       t         1            0           1
u       u, Ø      2            1           1
ú       u         1            0           1
ü       u         1            0           1
v       b         1            0           1
x       ks, x     1.1          0           1.1
y       , i       2            1           1
z                 1            0           1
ñ                 1            0           1
ch                1            0           1
ll                1            0           1
qu      k         1            0           1
rr      r         1            0           1
ai      ai        1            0           1
ái      ai        1            0           1
au      au        1            0           1
áu      au        1            0           1
ay      ai        1            0           1
áy      ai        1            0           1
ei      ei        1            0           1
éi      ei        1            0           1
eu      eu        1            0           1
éu      eu        1            0           1
ey      ei        1            0           1
éy      ei        1            0           1
oi      oi        1            0           1
ói      oi        1            0           1
oy      oi        1            0           1
óy      oi        1            0           1
ou      ou        1            0           1
óu      ou        1            0           1

Total Symbols: 52      Raw Total: 56.1             Adjusted Total: 52.1
                       Raw Difference: 4.1         Adjusted Difference: 0.1
Total Rules: 4         Raw Ratio: 1.08             Adjusted Ratio: 1.002
                       Raw Phonemicity: 92.69%     Adjusted Phonemicity: 99.81%

At this point, since one of the key purposes of this work is to provide a useful framework and toolkit for linguists developing entirely new orthographies for hitherto unwritten languages, one might ask if positional rules are truly relevant to that task. It is one thing to acknowledge them in traditional orthographies whose workings are as much a result of historical whims and unscientific prejudices as they are of deliberate planning (if not more so). It is quite another to potentially create and use similar rules in the far more deliberately and scientifically designed writing systems being invented by contemporary linguists. Would it not be simpler overall to eschew the positional module altogether and rely on completely isomorphic correspondences? Such a move would not only collapse the raw versus adjusted rating dichotomy into a single measurement, but it would also obviate the need to distinguish phonemicity from graphemicity,


since the two would inevitably emerge as exactly equal in such an ideally isomorphic system. Also, as has already been stated, an excess of positional rules, regardless of their functional success at filtering out correspondential ambivalences, could slow down the acquisition of literacy. Why then, when presented with the opportunity to design an orthography from the ground up, would a modern linguist invoke such a seemingly complicating device?

The answer was hinted at earlier when it was observed that, without positional rules, Spanish might need entirely new symbols to represent occupied already by /k/ and /g/ and no other existing graphemes available. Even the more makeshift solution to any deficit of graphemes with respect to a particular phoneme roster, namely the use of digraphs such a /, belies the restriction that positional rules may best address. That limit lies in the number of symbols available to be utilized. As a descendant speakers were hesitant to expand or alter. The retention of <h>, which was pronounced in Latin, is at least partially a consequence of that same ancestry as well. Reverence for tradition and/or a sense of complacency inhibited the Spaniards from expanding their hereditary glyph roster as much as a trained linguist might have recommended.

These are social factors, and that is the final point to be made here. In developing an orthography, a linguist often does not have total control over the design. Socio-cultural influences can and often do make or break a writing system, at least in terms of its diffusion among and usage by the people for whom it was presumably intended. Positional rules, then, provide a means by which such cultural factors might be addressed at minimal cost to the clarity and precision of the system in development. They can be especially potent when combined with phonotactics. For example, if the speech community is annoyed by


the peppering of frequent diacritics, perhaps omitting them if the pertinent phonemic contrasts are not deemed worth the extra marking, then the linguist may examine the phonotactics of the language. If a diacritic marks a phonemic contrast that is neutralized in particular environments, then such environments may serve as the conditions for a positional rule that noticeably reduces the occurrence of that diacritic, with the proviso that those conditions are simple enough to be expressed and therefore taught without the use of obscure linguistic jargon. Striking the balance between maximal learnability and the avoidance of redundancy is no easy feat, and it may matter more (or even in different ways) in some situations than in others. Still, the degree to which such strategies become important and the nature of that importance are just two among many ways in which cultural considerations can impact orthographic structure, which is why the next chapter focuses on a more in-depth look at those sociolinguistic factors.


CHAPTER 5
A SOCIO-CULTURAL PERSPECTIVE

The pool of available graphemes and the frequency of certain diacritics are only the tip of the iceberg of socio-cultural issues that can often prove crucial in the design of a new orthography. Due to the largely scientific nature of their training, with its main emphasis on the parsimonious synthesis of generative accounts from raw data, a typical linguist can be easily tempted to build a spelling system based on methodological analysis of the relevant language's phonology. To a large extent, the success at doing so is what the techniques presented in the preceding section seek to measure. However, this purely functional outlook on orthography sometimes fails to mesh smoothly with the host speech community's own perceptions of their language, any previously established habits or prejudices, and/or the intended spheres of usage for the proposed code.

The choice of graphemes and/or even the conventions for arranging them can often have strong connotations regarding regional, national, or even religious identity. Most linguists, for instance, agree that Hindi and Urdu are two dialects of the same language that only diverge in sufficiently high registers. Yet the former is written in Devanagari by Hindus, while Muslims insist on using the Arabic script, and the distinction in written form is cited by users as justification for labeling them as separate languages. Furthermore, to a linguist with insufficient awareness of potential connotations, assigning <k> to the voiceless velar stop may seem like just as viable a choice as <c>, but the community whose language he/she seeks to represent in writing may associate <k> with Anglophone or otherwise Germanic culture and view it as an infringement, in which case <c> might be the better option.

For a more striking example, one need only consider the ousting of the Cyrillic script in favor of the Roman script in Moldavia following its 1989 secession from the Soviet Union. The


irony is that Cyrillic boasted a greater number of characters with which to uniquely represent all phonemes of Moldavian, for which the Latin repertoire had an actual deficit of symbols. From a purely linguistic perspective, then, it was not an optimal decision. Still, the re-Romanization of Moldavian reasserted the new nation's independence from Russia, allowed for far more natural interchange of literature with their linguistic brethren in Romania, and also paid homage to the language's Latinate origins as a member of the Romance family. Indeed, linguistic and socio-cultural drives could arguably be called downright antithetical to each other in some cases. An activist in the Malay-Indonesian orthographic disputes of the 20th century even commented once that the main proposal under consideration at that time was "linguistically speaking so good that it could not be prescribed at all to the public" (Sebba 2007).

In 1953, Heinz Kloss introduced the terms abstand and ausbau to differentiate languages distinguished by their inherent properties from those constructed by sociological elaboration. An abstand language, according to Kloss, is defined by its "intrinsic distance" from all others, while an ausbau language is defined by its having been culturally "reshaped," most often by means of a distinct literary tradition. No linguist would hesitate to classify two or more abstand entities as separate languages based purely on the spoken forms. The classification of two or more ausbau entities, however, may derive much or even most of its justification from the respective written forms (Kloss 1967). The aforementioned example of Hindi and Urdu can easily be thought of as a prototypical case of ausbau languages, while Hindi and Greek, for instance, are clearly abstand languages.
Kloss describes abstand as a linguistic parameter and ausbau as mainly a sociological one, but when these terms are adapted for the discussion of orthography itself as in Sebba (2007), the former also assumes a sociological dimension. In this discourse, abstand has come to mean


symbolic differentiation from the written language of an invasive culture as a means of national self-assertion, and ausbau refers to the symbolic emulation of an established prestige language (usually that of the same dominating foreigners) for the sake of its respectability.

Some of the greatest insights into the importance of socio-cultural aspects to spelling may be gleaned from examining the development of new national orthographies for various local languages in post-colonial Africa, Southeast Asia, or South America. There, the opposing ideals of abstand and ausbau often played a large role in rendering contentious what might otherwise have been a fairly straightforward project (Sebba 2007).

Lessons from Post-Colonial History

For Sranan, an English-based creole with some Dutch influences spoken in Surinam, one of the first orthographic traditions was developed in 1783 by Schumann, a Moravian missionary from Germany. By that time, Dutch had been the language of prestige in the region for several decades, but Schumann's purpose was to provide a practical dictionary and pronunciation guide for his fellow German-speaking missionaries in order to better communicate the scriptures orally to their intended converts, and the system thus relied primarily on German spelling conventions. In the 1820s, however, Dutch patterns began to become more prevalent as the clergy increasingly focused on reaching congregants who were literate in Dutch (if they were literate at all). Though often deferring to etymology rather than contemporary phonology, this second orthography was effectively standardized by its use in the 1829 translation of the New Testament, and it survived with very little modification into the mid-20th century. Meanwhile, despite a relatively fleeting period of increasing allowances for education in Sranan starting in 1844, by 1877, the trend had been essentially reversed in favor of a policy that promoted Dutch literacy only.
Standard written Sranan was thus exiled from schools, and Surinam became a very exographic country in which the people spoke Sranan but wrote Dutch. The Dutch-based standard for written Sranan became


relegated to liturgical spheres, leaving secular Sranan writers to use their own makeshift codes (Sebba 2007).

In the 1950s, the Netherlands Bible Society concluded that the time was ripe for a new Sranan Bible edition that would be easily legible by the masses. They sent the Dutch linguist Jan Voorhoeve to Surinam to facilitate the project. There, Voorhoeve found a multitude of mutually unintelligible orthographic codes in use by various Sranan writers and their supporters. In 1957, he managed to gather all of the interested parties and oversaw the compromise which produced a unified proposal. The third major Sranan orthography maintained some uniquely Dutch features, such as the spelling of /u/ as <oe>, but dispensed with the numerous deferences to etymology in favor of a much more rigidly phonemic approach. Subsequent tests of the newly proposed system elicited complaints that it evoked a spoken form that diverged too greatly from the ecclesiastical tradition. It turned out that, in the interim during which written Sranan was restricted to liturgy, a distinct spoken variety had emerged based on pronouncing the many etymological spellings of the older code as if they were more phonemic. This dialect of spelling pronunciations had grown to be closely tied to and indexical of the church and religious practice, despite its obtuseness with respect to everyday Sranan speakers (Sebba 2007).

Despite these complaints, the colonial authority in Surinam approved in 1960 an official standard that closely resembled Voorhoeve's proposal, and it remained official until after Surinam gained its independence in 1975. In the 1980s, a new government delegation called the Kramp Commission was tasked with resolving dissatisfaction with the official orthography from 1960. The result was a new system introduced in 1986 that took some relatively radical steps, most notably abandoning some particularly Dutch features in favor of associations that were more common internationally.
The phonemes /j/ and /u/ were re-assigned to the more universal


<y> and <u> respectively, ousting the traditionally distinctive <j> and <oe> representations inherited from the Netherlands.

The Kramp proposal sparked debates between two main opposing parties. Some argued that a divorce from Dutch spelling patterns would make it difficult for children to transition from Sranan literacy to Dutch literacy, the latter of which was still the national written standard. Others argued that not only would the reforms enable written Sranan to interface more intuitively with most other Latin-script languages in the world, but they would also reflect and reify the re-definition of Surinam's relationship with its former imperial ruler. In this case, then, socio-cultural factors transcended even the national sphere and dealt largely in international terms. The 1986 decision of the Kramp Commission to ratify the internationalist proposal marked a victory for global pressures and a "symbolic rejection of Dutch colonial mastery" (Sebba 2007).

Somewhat similar tensions characterize the orthographic history of the French-lexicon creole spoken by virtually all native Haitians, most of whom are monolingual. The first attempts at writing it were characterized by dialectization, which is broadly defined as the sociolinguistic perception of an arguably separate language as just a dialect of a more prestigious relative (Kloss 1967). As is often the case for creoles, Haitian Creole was dialectized under its principal lexifier and thus written using a mildly modified form of standard French orthography, as exemplified by an anthology of fables that demonstrated the vernacular's readiness for literary usage. The use of French spelling conventions continued until about 1940, when Creole-based literacy campaigns generated the need for a distinct orthography. The first response to that need was developed by Ormonde McConnell, an Irish missionary, and revised in 1943 by Frank Laubach, an American literacy expert.
The phonemic McConnell-Laubach orthography was essentially standardized under the regime of Elie Lescot and its literacy campaigns, but by 1953, it had largely fallen out


of use. The reason was that the educated elite of the time saw Creole as subordinate to standard French, and they consequently believed that literacy in the local vernacular should not by itself be the goal but instead serve only as a mere bridge towards literacy in French. To those ends, these literates wanted the orthographic conventions to better resemble French patterns (Sebba 2007).

A further revision to the McConnell-Laubach system was developed by C. F. Pressoir in order to appease the conservative Gallophiles, with features such as the use of silent <n> instead of circumflex diacritics to mark nasal vowels and other typically French mechanisms. However, as policy shifted focus from adult to juvenile literacy training during the 1970s, that revision was replaced yet again by an all-new and highly phonemic orthography designed jointly by French and Haitian linguists, which was made official in schools in 1979. Since then, a tripartite politicization of Creole orthographic debates has emerged. Motivated by the persistent belief in the ultimate ascendancy and prestige of standard French, pro-etymologists prefer a system that resembles that standard as closely as possible in order to better serve as a proverbial stepping stone towards French literacy. On the other hand, the pro-phonemicists prefer a system that reliably represents the sounds of spoken Haitian Creole. Meanwhile, an intermediate stance says that pronunciation should be faithfully encoded, but such consistency should be accomplished via characteristically French correspondences that are simply distributed in a more predictable fashion (Sebba 2007).
Although there are some practical considerations, such as the frequency and/or variety of diacritics which may slow the task of writing or especially typing, these orthographic debates are mainly driven by ideologies, which are rooted in and reflective of the social inequalities between the educated elite and the general populace as well as the linguistic inequalities between Haitian Creole and standard French. The latter enjoys a prestige instituted first by French clergymen and


later by an indigenous ruling class who both admired and despised France. Supporters of a more etymological orthography insist that French shall always be an important language for Haitians and that Haiti must retain its membership in the global Francophone community. Proponents of a more phonemic orthography call for Haiti to assert its independence from France and even the French language itself. There is also a more international dimension incited by the phonemicists' preference for letters such as <k> and <w>, which their opponents view as largely indexical of Anglophone culture and therefore encroaching upon the Francophone tradition, which favors <c> and <ou> for the same respective sounds. The phonemicists respond to that charge with reminders of the scientific merits behind the usage of <k> and <w>, perhaps failing to realize that technical elegance is valued quite differently by critics of the phonemic system (Sebba 2007).

Returning to the German missionary origins of the earliest orthographies for Sranan, an important observation can be made. This thesis is intended to function as a resource primarily for linguists working in the field of developing orthographies for hitherto unwritten languages. Such linguistically trained orthographers are most often motivated by a need or desire to spread literacy into marginalized societies and/or to fortify endangered languages against irrecoverable loss. The involvement of designers with such deliberately relevant skill sets and motivations is, however, by no means a long-standing tradition. In fact, it is a fairly recent phenomenon. In the past, aside from the few original geneses of writing itself, new orthographies have tended to arise from the habits of bilinguals in situations of sustained language contact, usually importing or at least drawing heavily from the pre-existing orthographic customs of the superstratal language(s).
For creoles, this is particularly likely due to the historic tendency to treat them as mere dialects of their lexifiers, as has already been noted in the case of Haitian Creole. In Sranan, too, while


debates raged over the use of <oe> versus <u> for the phoneme /u/, the former representation was only ever an option in the first place due to the Dutch origins of the older contending code. In essence, a sort of tradition was established by linguistically unrefined innovators, and by the time someone with linguistic training entered the picture, a precedent had long been set that the pertinent speech community had internalized and woven into their collective heritage. The result is that, when the particular conventions in question are challenged, some speakers feel as if there is more at stake than a simple grapheme or two.

Another potential result, at least if the distinctive features of the original superstratal language's orthography are retained, is that the nascent indigenous orthography may appear strikingly different from other systems in the same language family and/or geographic area. Perhaps the most intriguing example of this is Manx, a Celtic language spoken on the Isle of Man in the Irish Sea. Manx is closely related to Scots and Irish Gaelic, but its orthography is unique among such relatives. The first cohesive spelling system for Manx was principally invented by John Phillips, the Bishop of Sodor and Man from 1605 until 1633. Along with at least one collaborator, he wrote a translation of the Book of Common Prayer into Manx, which was characterized by consonants represented according to English conventions of the time and vowels spelled as they tend to be in Romance languages such as Italian or Spanish. Crucially, though, Phillips' code distinguished only those phonemic contrasts that English shared and was equipped to discriminate while obscuring Manx distinctions that were superfluous in English. The subsequent bishop of the region, Thomas Wilson, adapted Phillips' system and used it in a later translation of the Bible and a second rendition of the Book of Common Prayer.
The Manx edition of Principles and Duties of Christianity, a tome published in 1707, also used the Wilson orthography, which would survive mostly unchanged into the present. The system uses


several conventions that clearly imply English influence, including the use of <oo> and <ee> to spell the respective high vowels /u/ and /i/, a pattern that is ostentatiously unique to English and perhaps a few of its known historical substrates (Sebba 2007).

A critical factor was the linguistic repertoires of both the Manx people and their English clerical administrators. The ecclesiastical translations clearly could not have been the product of monolingual Manxmen. Many of the translators may not have even been native Manx speakers, and even those who were native to spoken Manx were probably not very accustomed to writing in their mother tongue. Instead, their primary literacy was likely in English. Furthermore, as in the case of the early German-based Sranan orthography, this Manx orthography was initially intended as a special pronunciation guide for English-speaking priests seeking to spread the Gospel more easily via the local language. In other words, it was not originally meant as a medium of primary literacy for Manxmen themselves. That role was reserved for written English, just as standard French remained the final target language of literacy training in Haiti. Once again, the orthography's intended sphere(s) of usage and the educational background of its designers were instrumental in shaping the choice of particular graphemes and correspondences, which gave Manx its distinctive appearance relative to the orthographies of its closest relatives (Sebba 2007).

That distinguishing look is in itself neutral with respect to the overall merit and fitness of Manx spelling, but the resemblance or dissonance between the orthography in question and other orthographies can become important if it takes on special meaning for the speech community. If the Manxmen had ever declared their independence and subsequently sought to assert their own nationalism, written Manx might well have become one medium through which the new national pride was performed.
In that case, one should not be surprised to find at least a significant party


within the Isle of Man proposing an orthographic reform that would rid the national standard of some or all correspondences which are uniquely English and therefore index Anglophone power. One can easily imagine debates over the retention or ousting of <ee> for /i/ and <oo> for /u/ in the same vein as those that actually did arise over the use of <oe> for /u/ in Sranan. A successful campaign for the overthrow of such politically charged mappings would mean a great victory for abstand, while a successful counter-movement would mean a triumph for ausbau (Sebba 2007).

Particularly in the Sranan case, however, one of the contributing factors used to defend the more progressive orthography was the idea of internationalization. This can be thought of as a special brand of ausbau in which the model for emulation is not any specific language but rather a generic set of phoneme-to-grapheme mappings that derive a form of prestige from their greater commonality among languages that use the same script. For instance, <u> was touted as a suitable replacement for <oe> in the representation of /u/, because across languages that share the Latin alphabet, that is the most prevalent assignment. Similar pressures from international usage were influential in fostering a similar result in the joint Malay-Indonesian orthographic disputes, which had also emerged from former Dutch colonialism (Sebba 2007).

Since the international model is closely related to the IPA, linguists have naturally tended to prefer it in designing new orthographies for remote languages, which has increased the number of Roman-script languages that use more or less the same essential sound-to-symbol assignments. This in turn creates even more international pressure for those correspondences, thus giving subsequent orthographic designers even more reason to use them. Arguably what arises, then, is a global positive feedback loop, probably reinforced more than ever by economic globalization.
As we have seen, however, internationalizing ausbau is only one of multiple factors that can influence the fitness and ultimate success of a proposed orthography. Methods of measuring


a spelling system along functional linguistic dimensions have already been proposed. How now might we analyze the same system along socio-cultural lines in order to arrive at a well-rounded evaluation of its overall merit? While the relative precision of the tools introduced for assessing a system's technical efficiency may be difficult to match in the socio-cultural realm, an important step is to propose a set of key questions for the orthographer to ask about the speech community. The answers to these questions will inform the orthographic design process, including the degree of phonemic ambiguity that may be tolerated if the weight of any extra-linguistic factors renders absolute predictability impractical.

A Socio-Cultural Questionnaire for Orthographic Design

What follows below is a proposed superset of questions with potentially universal applicability from which any relevant subset may be used to elicit useful information in the development and implementation of a new spelling system, though it may be advised that the entire list be used in most cases, at least initially. Each question is then followed by a brief discussion of the specific points of impact that it may have in an orthographic design project.

1. With what (other) literate cultures has the speech community been in contact throughout its history, including the present time, and what is or was the nature of that contact?

The historical examples described at length above demonstrate the importance of language contact in shaping the orthographic preferences and/or habits of any speech community. In contact situations, it is arguably inevitable that each participating language will take on some socio-linguistic significance. In the previously examined examples, one can see how German, Dutch, French, and even English all contributed to the dispute over exactly what an "optimal" orthography should be.

2.
Is the speech community currently in an exographic relationship with another language in the region? In other words, does the language that a clear majority of the people speak most fluently differ from that which the same majority writes most fluently (Lüpke 2011)?


Exographia can have a substantial impact on the prominence of ausbau in popular discourse on what the new orthography should be like. As we saw particularly in the Haitian case but also in the Sranan case, the dominant written language is often the main model for spelling the spoken language. There are two related reasons for this. First, the written language is liable to acquire significant prestige due to its likely role in administration and greater social mobility. Secondly, due in no small part to that same mobility, writing the spoken language may be treated as a mere means to the ultimate end of literacy in the already written one. We saw this particularly in the case of Haitian Creole, for which pro-etymologists argued that a more French-like system would render the acquisition of standard French literacy easier.

3. What are the intended spheres of usage for the new orthography, and who will be its primary (or even sole) users?

It can be easy for a Western linguist, raised in a culture that prizes near-universal literacy and almost takes it for granted, to think of the answer to this question as self-evident. However, one need only be reminded that, throughout much of human history, literacy was relegated (where it existed at all) to a specific group or caste of society. In medieval Europe, for instance, that group was predominantly the Catholic clergy. It would be naïve, then, to assume that no orthographer will ever encounter a culture seeking his/her aid in which writing will not necessarily be used in all spheres by all people. The sphere(s) of usage and the background(s) of the people with access to literacy can have rather substantial ramifications for orthographic design. The original intent of Manx orthography as a tool for native Anglophone clerics to pronounce the local language and communicate more effectively with Manx speakers, for instance, had a profound impact on the correspondences and consequent look of the system.
Of course, the Manx situation was an accident of history, but that does not necessarily preclude the utility of consciously designing a spelling system with a sufficiently narrow range of users in mind. If written language is limited


to addressing particular themes, for example, then perhaps a greater degree of surface ambiguity may be tolerated, since the predictable context can be enlisted to resolve duplicities.

4. Is literacy in the hitherto unwritten language itself the ultimate goal, or is it merely a bridge towards literacy in a separate regional prestige language? Alternatively, is there any non-local language favored for emulation due to more general reasons of prestige and/or social mobility?

This essentially asks if there is any drive for ausbau to consider and, if so, what the relative weight of such considerations is. This is another question whose importance may not be readily apparent, and yet if the preceding look at Sranan and Haitian Creole is at all indicative, it is probably one of the most pivotal questions. It has clear and immediate ramifications for the importance of incorporating some resemblance to the prestige language's own orthography and indeed for the minimal extent of that similarity that will appease the speech community in question.

5. Conversely, is there any other language(s) to which a similar appearance is to be especially avoided in order to express the speech community's uniqueness and self-determinism?

This question is aimed at establishing the presence and relative strength of abstand within the speech community. The answer(s) to this question and question 4 will effectively decide where the fulcrum between ausbau and abstand lies, at least approximately. In both the Sranan and the Malay-Indonesian case, as has been shown, one of the motivations for replacing <oe> with <u> was to symbolically oust Dutch control after it had already been overthrown in a political sense. One more example of abstand proving somewhat decisive is in the reforms to American English spelling by the lexicographer Noah Webster in the early 19th century.
Motivated at least in part by post-Revolutionary patriotism, some of Webster's changes have successfully become identifying characteristics of modern U.S. English orthography (Upward and Davidson 2011).


6. To what typographical and/or word-processing technologies does the speech community have ready and voluntary access?

This is a concern that has ironically been both initially instigated and increasingly mitigated by computers and other digital information technology. The advent of Unicode and the availability of multiple software mappings of the same keyboard hardware have vastly improved the ease with which characters beyond the 26 letters of ASCII can be typed. Many keyboard layouts are pre-packaged with most modern PCs and can be activated within minutes. There even exists free software that allows the creation of customized layouts. Still, there is no guarantee that the speech community in question will necessarily have access to the most up-to-date equipment, and even if the requisite technology is available, members may be unable or unwilling to use it. This can have some immediate ramifications for the range of graphemes available to the linguist to use in uniquely representing the phonemes of the pertinent language. In turn, this can impact the choice of digraphs or diacritics as a mechanism by which the core 26-letter QWERTY roster is ultimately extended if there is a deficit of monographs. The culture in question may also have particular attachment to one or more traditionally significant characters that are not even in the extended Unicode set. Although Lüpke (2011) observes that restriction to Unicode characters may prove pointless in a manuscript culture that rarely if ever uses digital word processing, she also warns against assuming that this will always be the case and making a design decision that could render a later transition to computer-mediated writing more difficult than it otherwise might have been.

7. Does the language have a prestige dialect of its own? How much flexibility do the users want to represent different regional accents?
While not a noticeable factor in the illustrative histories given above, this question might be important for self-evident reasons, especially in diasporic cultures for which the influences of


various regional superstrata are likely to produce a wide range of dialectal variation. Besides elevating an existing dialect to standard status, ways to cope with dialectal variation include koiné synthesis and pandialectal underspecification. The latter refers to marking only those contrasts that are common to all or most dialects. The former refers to the creation of a single artificial or semi-artificial koiné variety that would presumably either blend the phonologies of most or all dialects or mark all contrasts made by the most phonologically rich variety (Lüpke 2011).¹

8. Is there a grassroots orthography already in use by a significant portion of the speech community? If so, what is its nature?

Another matter not directly revealed by the examples so far given but discussed at length by Jan Blommaert is the existence of grassroots literacy, which refers to the repertoire of reading and writing practices of those who are only partially if at all competent (or inclined to participate) in a deliberately and centrally regulated writing system (Blommaert 2004). The coinage grassroots orthography then refers to any cohesive set of such practices that most often coalesces from the collective habits of multiple individuals who semi-consciously synthesize a range of consensual norms due to the demands of effective communication. Anything resembling regulation emerges bottom-up as some habits spread widely enough to become quasi-conventions while others either remain idiolectal or actually fade from use even among their inventors. As such, any grassroots orthography could well be classified as a form of peripheral normativity, which, to paraphrase Blommaert, is a set of norms or quasi-norms developed and used by people outside of the more rigid, centralized, and socially mobile normativity that might otherwise be applied.
¹ The raw concepts are largely Lüpke's, while the exact terms used to refer to them are my own coinages.

It may also be analyzed as a somewhat stabilized and diffuse sort of what Blommaert calls "heterography"


(Blommaert 2006). Text messaging lingo, including its abbreviations, can be thought of as a good example of a grassroots orthography. One more useful way to conceptualize grassroots orthographies may be as the unregulated or loosely regulated "spaces" described by Sebba (2007). However one understands them, though, they could be said to confirm Lillis' (2013) argument that written language need not be purely the domain of rigid codification and affectation.

Measuring Proximity to a Model Orthography

A prominent thread throughout the discussion of historical examples and the key issues revealed by those illustrations is the nascent orthography's overall proximity to or distance from the established orthography of a language that, for one reason or another, the speech community seeks to emulate. Being able to place a provisional orthographic system, at least approximately, on the ausbau-abstand spectrum would therefore be a great stride, in many and maybe even most cases, towards assessing the socio-cultural fitness of the proposed code. In order to do this, we must devise a methodology for measuring the similarity between two or more orthographies.

Here may lie another potential application of the formalism introduced in Section 2. If each orthography to be compared is fully described in terms of correspondences and rules, then the comparison becomes relatively straightforward. First, one lists in the decoding direction all correspondences for the language with the fewest phonemes. Next, those correspondences from the other language that decode the same graphemes are similarly enumerated. Then, the number of correspondences that match is divided by the total number of correspondences, and the result is expressed as a percentage.
A match between homographemic correspondences is defined as occurring when at least one pair of referenced phonemes from each code can be found such that either one is identical to the other or one is the host language's nearest articulatory analog to the other (henceforth described as "maximally analogous"). If, for instance, the model system has = /f/ while the proposed system has = / /, provided that the host language of the latter


lacks a separate /f/, these two correspondences may be said to match.

One need not necessarily start from a decoding perspective. The above methodology can also be used from the encoding perspective, in which case one would count the matches between homophonemic correspondences, or correspondences that encode the same or at least maximally analogous phonemes. In either case, we can also refine the resulting estimate of raw proximity by incorporating the positional rules of each code to arrive at an estimate of adjusted proximity. A match between positional rules can be similarly defined as relying on the same or maximally analogous trigger conditions, applying to a sound or symbol whose raw correspondences already match, and having the same or maximally analogous elsewhere options.

Nonetheless, both raw and adjusted proximity are assessments of what one might call functional proximity in order to distinguish it from etymological proximity. The latter metric is relevant mainly to creoles or other languages for which the bulk of the vocabulary is borrowed from a superstratal language. In the case of Haitian Creole, we observed a certain intermediate political position between the pro-phonemicists and pro-etymologists. Advocates in that middle camp wanted words to be spelled according to actual Haitian pronunciation but with distinctly French correspondences. In other words, they wanted a system that would evoke proper or at least recognizable Haitian Creole pronunciations if read aloud by someone literate in standard French, even if that French reader would hardly understand a word he/she was saying. This is much like how the Manx orthography would be read aloud by a native English speaker. The degree to which such an aim is accomplished is what functional proximity measures. The pro-etymologists, on the other hand, wanted each word to resemble its semantic French equivalent regardless of Haitian pronunciation.
In other words, they wanted a code that a native French speaker could better understand, even if his/her pronunciation would be less accurate. That is etymological proximity, closely related to archaeography, which differs only in that the language in question is not a creole and bases many or all of its etymological spellings on formerly more phonemic renderings from its own independent ancestor. Two of the most historically influential languages in the world, French and English, exhibit high degrees of archaeography. In fact, according to Upward and Davidson (2011), English is perhaps characterized by an unusually high incidence of both archaeography and etymological proximity, particularly to Latin, Norman French, and classical Greek etymons. Samuel Johnson, the first major English lexicographer, declared outright his reluctance to tamper with traditional forms in his seminal 1755 dictionary, expressing a view that still resonates with many modern Anglophones.

When a question of orthography is dubious, that practice has, in my opinion, a claim to preference, which preserves the greatest number of radical letters, or seems most to comply with the general custom of our language. But the chief rule which I propose to follow is, to make no innovation, without a reason sufficient to balance the inconvenience of change; and such reasons I do not expect often to find. All change is of itself an evil, which ought not to be hazarded but for evident advantage. (Upward and Davidson 2011)

Furthermore, the historical trajectory of Greek itself is perhaps one of the most illustrative examples of archaeography and particularly its repercussions on graphemicity. In classical Greek, the graphemes < >, < >, < >, < >, and < > all represented different phonemes (Hansen & Quinn 1992), but in modern Greek, all of those originally distinct phonemes have merged into /i/. Nonetheless, the orthographic differentiation remains (Matsukas 2010), thereby constituting a substantial detraction in graphemicity for the contemporary system known as dimotiki.
In fact, until the middle of the 20th century, standard written Greek usually came in the form of the even more archaic katharevousa, and the transition was quite tumultuous. Although the archaism of katharevousa extended to an entire register of Greek, encompassing grammatical and lexical as well as phonological atavisms (Jahr 1993), what both the Greek and English cases reveal about the potential potency of long-term nostalgia in the planning of any standard written language is still worth noting. The push towards etymological proximity exemplified by Haitian Creole can be thought of as a form of such conservatism, despite the fact that it was a far younger language and thus had no direct ancestor independent from French to which the orthography could hark back. Most of what has been said and will yet be said here is potentially applicable not only to the development of entirely new written codes, but also to the renovation of long-established literacies, where such nostalgic sentiments are more likely to play a greater role. Hence, the possibility of archaeographic impulses, which often spring from a sense of shared historical identity (i.e. abstand via cultural legacy), remains relevant to the broader goal of this work.
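
As a rough illustration of the raw proximity count defined earlier in this chapter, the following Python sketch tallies homographemic matches between a model code and a proposed code. The `Correspondence` class, the `analogs` set of maximally analogous phoneme pairs, and the normalization by the number of model correspondences are all assumptions made for this example rather than definitions taken from the text. The sample inventories, including /ɸ/ as a stand-in for the nearest analog in a host language that lacks /f/, are invented.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Correspondence:
    """One grapheme and the phonemes it may reference (hypothetical structure)."""
    grapheme: str
    phonemes: frozenset


def is_match(a, b, analogs):
    """Homographemic match: same grapheme, and at least one pair of phonemes
    that is identical or listed as maximally analogous."""
    if a.grapheme != b.grapheme:
        return False
    return any(
        p == q or (p, q) in analogs or (q, p) in analogs
        for p in a.phonemes
        for q in b.phonemes
    )


def raw_proximity(model, proposed, analogs):
    """Share of model correspondences with a match in the proposed code.
    Dividing by len(model) is an assumption of this sketch."""
    if not model:
        raise ValueError("model code has no correspondences")
    matched = sum(
        1 for m in model if any(is_match(m, p, analogs) for p in proposed)
    )
    return matched / len(model)


# Invented toy inventories, for illustration only.
MODEL = [
    Correspondence("f", frozenset({"f"})),
    Correspondence("s", frozenset({"s"})),
    Correspondence("c", frozenset({"k", "s"})),
]
PROPOSED = [
    Correspondence("f", frozenset({"ɸ"})),   # host language lacks /f/
    Correspondence("s", frozenset({"s"})),
    Correspondence("c", frozenset({"tʃ"})),
]
ANALOGS = {("f", "ɸ")}  # assumed "maximally analogous" pair

print(raw_proximity(MODEL, PROPOSED, ANALOGS))
```

In this toy case two of the three model correspondences find a match, so the estimate is 2/3. An adjusted estimate would additionally compare positional rules, which this sketch leaves out.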


CHAPTER 6
CASE STUDIES

Dschang

Dschang is a Bantu language spoken in sub-Saharan Africa. Like most members of the Niger-Congo superfamily, Dschang has phonemic tone. Tonal contrasts serve both lexical and grammatical roles. While a typical Niger-Congo roster of basic phonemic tones (or "tonemes") consists of just high, mid, and low, a robust system of tone sandhi (morphotonemic alternation up to a phrasal level) generates contour, upstepped, and downstepped pitches as well. In some cases, it even goes so far as terracing, in which an upstep or downstep cascades throughout all subsequent syllables within a word or phrase. As a brief example, one can take the word l , in which a contour rising tone follows a level low tone, at least when the word is pronounced in isolation. If it follows a high tone, however, the melody becomes l t | , in which the initial low tone changes into a high tone and the rising tone changes into an upstepped high tone (Bird 1999b).

Given Dschang tone's role in making both lexical and grammatical contrasts, a linguist may instinctively conclude that orthographic tone marking should be based on surface or near-surface melodies. Bird (1999b) calls that initially obvious ideal into question quite clearly. He first observes the relative shortage of real experimental data on the reading fluency enabled by various tone-marking strategies and provides a survey of what experimental materials he was able to find. Although the methodologies of these studies preclude any rigorous generalization, they all suggest that surface-level tone marking may perform more poorly than other models, at least for certain languages depending on the particular role (i.e. morphophonological depth) of tonal contrasts therein. The languages analyzed in these experiments were mostly or all related African tonal languages, and the tone-marking paradigms under scrutiny ranged from eschewing tone marks altogether to the ubiquitous representation of surface (i.e. post-sandhi) melodies. For instance, one strategy was to mark only the tone that was least susceptible to morphophonemic alternation, while another was to mark tone only when it resolved a potential grammatical or lexical ambiguity (Bird 1999b).

Bird then proceeds to describe the methods and findings of his own experiment with Dschang orthography. The tonology of Dschang can be described "with some abstractness" as comprising three basic tones: high, low, and floating (often called "mid"). The existence of downstep then yields a total of six potential distinctions, with a seventh arising at the juncture between a verb and its object. In fact, the complexity of the Dschang tone system makes it rather unusual among tone languages in general. There have been various orthographies for it since the 1920s, but the one still in use today was first introduced in the 1980s and consistently represents surface tone. The tone diacritic density is unusually high at 58%. A brief sample of a Dschang text is provided below. High tone is marked with an acute accent, while mid/floating tone is marked with a macron. Low tone is indicated via the absence of any diacritic.

ta' en . P esh ' am (Bird 1999b)

Largely because downstepped high tones are treated like mid tones and the lowest tone that can be distinguished is a non-downstepped low tone, terracing in Dschang can sometimes render certain tonal contrasts unrepresentable by this orthography. For example, the verb roots ("to fry") and ("to choose") contrast both phonologically and orthographically in their isolated forms. In the first person singular present progressive, however, tonological alternations create a double downstep that twice lowers the vocalic tone in the root. For both, the low tone of the preceding downstepped marker spreads to the word-initial nasal. The second high tone in becomes mid and then low, and the original mid tone in becomes low and then doubly low. Since a lower-than-low tone cannot be orthographically distinguished from a simply low tone, both "I am frying" and "I am choosing" are therefore ultimately written as . Introducing an additional diacritic, however, would not entirely solve the problem, since there would still be no way to distinguish an underlying low tone from a doubly downstepped high tone like the one in , the first domino in this ambiguity-generating terrace (Bird 1999a).

In evaluating the current Dschang spelling system, Bird had four objectives. First, the assumed advantage of any tone marking over none at all was to be verified. Second, its fitness for novice versus experienced readers was to be compared. While beginners are likely to lack refinement in their use of context to clarify potential ambiguities and might therefore depend more on tone diacritics, that same dependency may have solidified into a habit for advanced readers rather than necessarily giving way to more potent contextual disambiguation abilities. The third goal was to identify specific constructions that lent themselves to ambiguities that impeded fluent reading in the current orthography and in a hypothetical unmarked system. Finally, the fourth objective was to examine participants' productive command of the tone-marking conventions (Bird 1999b).

Sixteen native speakers of Dschang of various ages, educational attainment, and levels of literacy were chosen. Five were ultimately excluded because they read so slowly that they would only ever parse the isolated form of each word, which would be detrimental to an experiment in which sandhi effects were so important.
Twenty texts of similar style, length, and difficulty averaging 200 words long were compiled and rendered in two versions each, one with and one without tone marking. These texts were then arranged randomly into two booklets of four texts each, such that in each one, two of the texts showed tone marking while the other two lacked it, but the choice of which specific texts were tone-marked differed between the two mini-anthologies. Subjects were then presented with either one or the other booklet and asked to read each text aloud. They were then asked to add tone marking to the originally bare passages. An initial study showed that the self-paced addition of tone marks was problematic. Some worked very slowly due to a focus on accuracy over efficiency, and others ultimately failed to complete the task due to fatigue. A 20-minute time limit was thus imposed on the tone-marking portion, with acknowledgement that not all subjects would finish (Bird 1999b).

The recorded readings were analyzed for speed, fluency, and accuracy. Speed was estimated as the average number of seconds taken to read 100 words, every repeat of a syllable, word, or phrase was counted as a disfluency, and every incorrect tone was counted against accuracy. A native speaker categorized the inaccuracies as comprehension errors, in which the unintended tones produced different but still plausible interpretations, and performance errors, where the mistakes produced ungrammatical or anomalous interpretations. Non-tonal errors were not counted. Each participant's age, gender, level of education, degree of exposure to the conventional literacy training, and self-confidence in his/her mastery of the orthography were recorded. Cluster analysis was used to correct for individual variation (Bird 1999b).

The results of the reading task revealed a significant decrease in speed when deciphering the standard tone-marked orthography in comparison to the unmarked variant. There was also a decrease in fluency, though it did not quite reach statistical significance.
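
The speed and fluency measures described above reduce to simple arithmetic, sketched here in Python. The function names, the token-level treatment of repeats, and the sample figures are assumptions for illustration only. In particular, the sketch counts only immediately repeated tokens, which is a cruder criterion than the hand-coded repeats of syllables, words, and phrases described in the text.

```python
from statistics import mean


def seconds_per_100_words(duration_seconds, word_count):
    """Reading speed expressed as seconds needed per 100 words."""
    if word_count <= 0:
        raise ValueError("word_count must be positive")
    return 100.0 * duration_seconds / word_count


def mean_speed(readings):
    """Average speed over (duration_seconds, word_count) pairs, one per text."""
    return mean(seconds_per_100_words(d, w) for d, w in readings)


def count_immediate_repeats(tokens):
    """Crude disfluency count: each token that repeats the one before it."""
    return sum(1 for prev, cur in zip(tokens, tokens[1:]) if prev == cur)


# Invented figures: two texts read by one hypothetical subject.
READINGS = [(150.0, 200), (189.0, 210)]
print(mean_speed(READINGS))

# Invented transcript of a reading with restarts.
TRANSCRIPT = ["a", "a", "b", "c", "c", "c"]
print(count_immediate_repeats(TRANSCRIPT))
```

Comparing `mean_speed` for the tone-marked and unmarked texts of the same subject would give the kind of contrast reported here, although the significance testing and cluster analysis are outside the scope of this sketch.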
The analysis of the tone errors yielded rather unexpected results, with the unmarked orthography inciting no grammatical tone errors and only three lexical tone errors while the tone-marked texts elicited over ten of each error type. Meanwhile, the productive tone-writing task yielded some even more surprising results. For the purposes of this analysis, subjects were grouped into two categories: experienced and inexperienced. At 83.5% overall and 73.1% relative to the presumed result of random placement, the average accuracy of the former group was surprisingly low, considering that it included Dschang speakers who "probably control the orthography better than any other speakers of the language." Moreover, those in the latter group actually averaged worse than sheer chance at placing high tones.

Three more main observations were made. First, especially low overall scores for both groups in placing the floating tones may be due to the instability of that tone. If stability is defined as the probability that the isolation-form tone on any syllable will remain the same in a phrasal context, then a comparison of the original tone-marked texts and versions that mark tone only according to isolated forms suggests that the floating tone is indeed the least stable. Second, the comparatively high number of marks inserted for floating tone, both correct and incorrect, suggests that that tone is a sort of default case to which the subjects tended to retreat when sufficiently uncertain. Third, an analysis of the errors by inexperienced readers reveals that they are not basing their tone marking on how each word would be pronounced in isolation, as one might expect (Bird 1999b).

The final part of this experiment consisted of a six-item questionnaire intended to assess the speakers' everyday usage of tone marking as well as their personal opinions of it. The results of this questionnaire are consistent with those of the reading and writing tasks. More than half of the participants reported that they do not fully mark tone in personal correspondence and would support a reduction in tone marking. Bird's work also shows that accuracy in writing tone diacritics is an area in which speakers tend to overestimate their skills.
The more confident subjects read faster but less accurately, and while it seems that Dschang literates might benefit from some reduction in tone marking, the complete elimination of it is not supported. Current Dschang orthography is based on the analysis of modern professional linguists who presumably identified those tonemes that are structurally contrastive and therefore important to distinguish in writing (Bird 1999b).

This is a clear case of phonological merit not necessarily guaranteeing the optimality of the system for practical use, but why then would it not? Bird proposes that the answer lies in the morphophonological depth of Dschang tonology. The same morpheme can take on several different tonal melodies depending on inflection and/or derivation as well as phrase-level sandhi. If the orthography represents surface or near-surface forms, then this will naturally lead to comparable variation in the orthographic realization of the same morpheme. In other words, the average tonemic proximity of a morpheme's surface form to its underlying form is often not close enough to allow the former to be the most suitable basis for spelling. The fact that this rich morphotonology is active at the syntactic as well as the morpholexical level is especially important, because it prevents the establishment of a consistent word image, or holistic visual shape of a word, for any given lexeme. Invariable word images are arguably what allow advanced readers to parse text more efficiently than those who persist in "sounding out" each and every word anew regardless of familiarity (Bird 1999b).

However, as has been said, Dschang is an oddball among tonal languages in terms of its morphotonological complexity, and that is why Bird warns against generalizing his results to all tonal orthographies. In a tonal language with a much simpler and more transparent relationship between underlying forms and morphophonemic surface forms, an orthography like the current Dschang standard might serve its users much better. The spelling system's deference to surface forms itself is not the problem.
It is instead the mismatch between that shallowness and the significantly greater depth of the host language's morphotonology. An important process in designing a tonal orthography, then, is assessing the pertinent language's morphophonological depth and tailoring the orthography to have compatible depth. This does not necessarily mean that the deepest underlying form should always be the basis for spelling, but it does suggest, at least, that such frequent gaps as wide as those in many Dschang lexemes between the ultimate deep form and its written rendition should probably be avoided (Bird 1999b).

In fact, in another similarly themed paper, Bird suggests that this notion of language-specific matching of orthography with tonology, revealed above in terms of morphotonological depth, can be generalized to the overall role of tone in the phonology of the language in question. It is not enough to determine which tones are phonemically contrastive in the broadest sense and assume that all such tonemes must be marked at every single instance. The functional load and nature of tonal contrasts in the language can have a significant impact on the choice of optimal spelling strategies. For example, in languages that have tone spreading across word boundaries, one especially apt approach might be to mark tone only when it is different from the preceding tone. This would be consistent with the hypothesis that changes in tone are more salient for native speakers than the actual pitches. In some other languages, perhaps those with a sufficiently limited inventory of possible word melodies, a single diacritic at or near the beginning of a word could be used to mark the melody of the entire word rather than the pitch of its host syllable. Occasionally, even non-tonal features of the host language's phonology can be relevant to tone marking, as when the absence of vowel length contrasts allows the possibility of doubling vowel graphemes as a means to represent certain tones or at least relieve some of the functional load (and therefore frequency) of diacritics. Contour tones may be especially apt for this sort of strategy (Bird 1999a).
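
The strategy of marking tone only where it differs from the preceding tone can be made concrete with a short Python sketch. The function names and the tone sequence are invented, and the decision to always mark the first syllable is an assumption of this example, not something the text specifies. The second function computes the share of syllables that carry a mark, in the spirit of the tone diacritic density figure cited for the current Dschang orthography.

```python
def mark_changes_only(tones):
    """Keep a tone label only where it differs from the preceding syllable's
    tone; unmarked syllables become None. The first syllable is always
    marked (an assumption of this sketch)."""
    marked = []
    previous = None
    for tone in tones:
        marked.append(tone if tone != previous else None)
        previous = tone
    return marked


def mark_density(marks):
    """Proportion of syllables that carry an explicit tone mark."""
    if not marks:
        raise ValueError("no syllables to measure")
    return sum(1 for m in marks if m is not None) / len(marks)


# Invented surface melody: H = high, L = low.
MELODY = ["H", "H", "L", "L", "L", "H"]
MARKS = mark_changes_only(MELODY)
print(MARKS)
print(mark_density(MARKS))
```

For this melody the change-only scheme marks three of six syllables, whereas marking every syllable's surface tone would mark all six. Whether such a reduction actually helps readers is an empirical question that the sketch cannot answer.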
Wolof

Wolof is another sub-Saharan language, spoken in Senegal as well as various portions of neighboring West African states, with a rather illustrative orthographic history that extends well into the present day. By the 11th century, the Berber Almoravids had superseded the influence of the Empire of Ghana and converted many in the region to Islam. In the 13th and 14th centuries, the old Senegalese kingdoms came under the rule of the Mali Empire, which was administered at its height by literate Muslims. The Islamic dominion facilitated robust trade in gold, slaves, and other commodities, but it also established a Muslim theocracy headquartered in the northeastern city of Tekrur. This spread of Islam and its associated prestige meant similar importance for the Arabic script, which has remained influential in the orthographic landscape of Senegal and the Wolof language.

European contact began with Portuguese and Dutch traders seeking routes into West Africa in the 15th century, and with them presumably came Christianity as well as the Latin script. The French followed in the 17th century, establishing several commercial outposts along the Senegal River and the nearby coastline. These trade hubs were mostly centered on large Wolof-speaking settlements, and the resulting rise in urbanization is what catapulted Wolof to prominence as a regional lingua franca. Wolofization received a further boost in 1885 when those same commercial centers were declared communes of a French colonial federation and their citizens were afforded special status compared to rural dwellers (Evers 2011).

Meanwhile, more and more Quranic schools called daara were founded as Sufi clerics joined in the demographic shift from rural to urban. Also, in a process dating back to the earliest influx of Islam into the area, Muslim missionaries appropriated native animistic use of inscribed talismans and syncretized similar ritual objects inscribed with Quranic verses, thus adding to the association of mysticism and sanctity with the Arabic script.
The preferred writing system in the daara became wolofal, a Wolof variant of ajami, which broadly refers to the writing of non-Arabic languages in the Arabic script. By the 19th century, this writing system had become central in the Quranic school system and in communicating the biographies of the brotherhood leaders who administered them to non-Arabic-speaking disciples, and by the early 20th century, wolofal was also shown to be suitable for more creative purposes, such as the religious songs known as qasayid (Evers 2011).

As for European influence, it was not until 1817 that a school in the Western tradition was founded by the Frenchman Jean Dard, who married an indigenous woman and had a family in the city of Saint-Louis. Dard eventually acquired enough Wolof to publish two dictionaries and a grammar of the language, and the Latin orthography that he developed served as the basis for much subsequent work by other scholars. His remarks on the Romanization of Wolof and the writing of grammars demonstrate either ignorance or at least dismissal of the already existing wolofal tradition, as he seemed under the impression that Wolof had until then been purely an oral language and culture. Those same commentaries, however, also asked the reader to forgive imperfections in his transcriptions, perhaps suggesting that perfect spelling in Wolof was deemed irrelevant to the ultimate goal of French education, which at the time did not include acquisition of a full-fledged and standardized written register of Wolof on par with French literacy (Evers 2011).

Senegal won its independence from France in 1960. Its first president, Léopold Sédar Senghor, was a highly educated and qualified Francophone linguist who supported the notion of a "new francophonie" fostering positive change for Senegal. Although he was a founding father of the African nationalist movement La Négritude, he was at the same time "strongly Francophone in orientation to progress." His policy favored the continued use of French as the administrative and academic language of the nation, which he justified with the belief that French and Wolof were each best suited for largely different discourses.
He viewed Wolof as more emotional and uniquely expressive of the "Black soul" and French as more rational and vigorous in its ability to express advanced philosophical, scientific, or otherwise scholastic ideas. According to Senghor, the gap in Wolof's lexical capacity to genuinely meet the expressive demands of modernity and Western-style academia would take a century or more of development to close. In the interim, then, Senghor argued that French should retain its inherited position as the country's official language (Evers 2011).

In 1963, the Ford Foundation funded the establishment of the Centre de Linguistique Appliquée de Dakar (CLAD), which was tasked with reconciling France's classical model of education with its West African context. The CLAD's solution was heavily influenced by an ideology of functional bilingualism in which "different languages could be unequal across domains but complementary in terms of identity." The CLAD was later incorporated into the University of Dakar in 1966, but during the last half of the decade, increasing pressure was put on the Senegalese government to better democratize education. In 1968, a succession of strikes by academics finally compelled Senghor to elevate six indigenous Senegalese languages, chief among them Wolof, to national (but not official) status in a series of decrees. These decrees continued until 1979, progressively codifying and standardizing the written forms of the six nationalized languages. Alongside these laws, however, there was also a parallel series aimed at protecting French, legislating the correct transcription of French borrowings into the indigenous languages and vice versa (Evers 2011).

Meanwhile, two competing ideologies emerged around usage of the 1968 standard Wolof orthography in Roman letters, which was itself largely derived from a syllabary developed by the Fédération des Étudiants d'Afrique Noire en France in 1959.
The first, led by Cheikh Anta Diop and his followers, sought to demonstrate the suitability of contemporary Wolof as a medium for the expression and transmission of the very same advanced and nuanced topics of modernity that Senghor had claimed could only be adequately expressed in French. To that end, Diop himself translated several diverse scientific works into Wolof, such as Einstein's theory of relativity. On the other side were the proponents of "informal education," a movement started in 1961 towards educational curricula that involved more hands-on learning and knowledge tailored for practical application in the local context, rather than the cosmopolitanism and abstractness of the Western model. While both parties used the standard Latin orthography for Wolof and gave it a central role in their respective agendas, they were using that code for very different goals. The schism grew to such an extent that, when a Senegalese NGO and the Ministry of Education joined forces in 1978 for an experimental implementation of informal education, the Diopistes largely refused to cooperate, thus contributing to a period of relative educational stagnation from 1975 to 1980 (Evers 2011).

A separate dispute arose between the Diopistes and Senghor in 1979 when the latter censored a film entitled Ceddo for its non-standard use of consonant gemination. The former party responded by defiantly publishing two new journals whose titles contained ostentatiously doubled consonants. Senghor supposedly justified his position with the 1856 grammar of Wolof by l'Abbé Boilat, who dismissed the idea of contrastive gemination despite acknowledging vowel length contrasts. Instead, Boilat suggested that word-final geminate consonants be interpreted as short consonants followed by a schwa. In Boilat's more French-derived system, one would have spelled the movie title , using a
for the voiceless palatal stop represented in the standard system with . Ironically, in spite of Senghor's support of the official 1968 CLAD orthography, which was deliberately distant from French patterns, his objection to geminated consonants seems to be rooted in deference to an older and far more French-like code (Evers 2011).

Amidst the contention over the usage of French versus Wolof in schools and the specifics of a Latin-script orthography for Wolof, it was perhaps as easy for contemporary activists as it may be for many scholars today to forget that Roman letters were not the only option circulating at that time, nor are they now. Aside from the parallel ajami system used for Wolof in Quranic schools across Senegal, which we will shortly revisit, there was also a far more distinctive script designed in 1961 by Assane Faye, a member of Instructors for the Teaching of African Languages. This Garay script was inspired by both Arabic letters and ancient Egyptian hieroglyphs. It was intended to express a pan-African identity, and Faye claimed that Garay could be utilized across Africa regardless of ethnicity for multiple languages. Despite the commendable bid for abstand, it has not gained any significant following. When it is shown to them, today's Senegalese youth typically marvel at its very existence and/or reject it because of its rather bizarre appearance in comparison to more familiar scripts (Evers 2011).

For its part, while the Romanized orthography for Wolof has come to encompass a wide range of usage spheres in public life, such that there is little or nothing of which it is particularly indexical, Wolof ajami has, if anything, become even more narrowly associated with Islam and the daara than it was in its infancy. In modern Senegal, wolofal is especially associated with the Mourides Islamic brotherhood, a Sufi order founded in 1883 by Seeriñ Cheikh Amadou Bamba. Bamba was a spiritual reformer who laid the foundations for Touba, an autonomous city-state within Senegal that successfully offers education, health care, and basic sanitation without any participation from the Senegalese government. There are no French-medium schools in Touba. Instead, Quranic schools administered by the Mourides dominate.
While Arabic remains the language in which the Quran is taught, Wolof is a primary medium of instruction, and unlike other areas of Senegal, Touba hosts a variety of religious poems by Bamba and others that are widely circulated. In the absence of any publishing house officially producing wolofal works, several printers and booksellers frequently offer photocopies of these popular writings. There are even two Touban television channels that regularly broadcast content in wolofal (Lüpke and Bao Diop 2014).

In fact, the Mourides have fostered an unusually robust corpus of religious and poetic literature in Wolof throughout all of Senegal, most of which lies in various private collections. Within Touba, however, the prestige that religion has brought to wolofal has led to its use in spheres that are elsewhere reserved either for the Latin script or for the French language, and across most of Senegal but especially in Touba, wolofal is also commonly utilized in personal correspondence as well as public signage. Still, the strong association with religion remains. That association has real ramifications with respect to enrollment rates in secular state schools, which are quite low in comparison to attendance in the daara. The preference for the latter may be explained by the very different epistemological philosophy used by these Quranic schools as opposed to the Western-style curriculum offered by French-medium education. The daara are steeped in an epistemology focused on the continuity of tradition and the inculcation of fixed truths and values. Western tradition, on the other hand, functions on an epistemology of social mobility, critical exploration of multiple paradigms, and intellectual dynamism. The former is often much more appealing to Senegalese parents, who wish to pass on their traditions to their children. Failure to realize and cope with this ideological difference is likely what perpetuates the paradoxical prominence of wolofal despite its official subordination and lack of prestige compared to French and standard Latin-script Wolof, even as the latter fosters much greater social mobility (Lüpke and Bao Diop 2014).
Despite the official recognition of wolofal in 2002 and subsequent attempts to standardize it, Wolof ajami remains largely uncodified. Like all African ajami scripts, it is based mainly on Maghrebi Arabic and influenced by the classical Arabic of the Quran. It sometimes differs from Modern Standard Arabic in the presence and placement of certain distinguishing dots. The letter , for instance, may be written with a dot below ( ) rather than above ( ) the main portion of the glyph, while ( ) may be written without any dot at all ( ) in word-final position. In addition, while vowel diacritics are used only for training purposes in the acquisition of standard Arabic literacy, wolofal uses them in most or all texts, essentially transforming the semi-abjad into an abugida. This may be unsurprising given that, even with the regular use of diacritics, Wolof still has over twice as many vowel phonemes as Arabic does, which leads to a substantial grapheme deficit. Even the consonant rosters of the two languages do not transliterate very well, particularly with Wolof's prenasalized stops. A number of strategies are employed in resolving such problems. An especially interesting technique that has emerged is a diacritic consisting of three dots placed above the host Arabic character. It has no single phonological interpretation, signaling only that the letter should be given a distinctively Wolof rather than a native Arabic pronunciation. The particular phoneme referenced depends upon context and/or phonological similarity to the one invoked by the host Arabic letter (Lüpke and Bao Diop 2014).

In a thorough study, Bao Diop explores in greater depth the grassroots strategies used to adapt the Arabic script to Wolof phonology and finds five basic techniques. First, those Wolof phonemes with maximally analogous counterparts in Arabic are consistently symbolized by the appropriate Arabic characters. That series includes /b/, /t/, /d/, /j/, /k/, /f/, /s/, /r/, /l/, /x/, /q/, /m/, /n/, /w/, and /j/.
A second strategy is homography, by which the same Arabic character may be invoked to reference multiple Wolof phonemes, frequently with the general Wolofizing diacritic described above. Perhaps most intriguingly, typical clusters of phonemes that are conflated into a single grapheme are not haphazard. Instead, they tend to reveal sensitivity to phonotactics on the part of the ajami speller. For example, the phonemes /c/, / /, / place of articulation, but also differ somewhat in their possible distributions. They can all occur in word-medial position, but /c/ at least cannot be final. All of them are usually represented by jim ( ). Similarly, /b/, /p/, / b/, and / p/ are often merged into the single Arabic letter ( ), and while any of them can occur word-medially, / p/ is phonotactically prohibited from initial position, while /b/ always devoices in final position. Overlapping with this practice is a tendency to represent prenasalized voiced stops, which Arabic phonology lacks completely, identically to their homorganic simple stops, not preceded by glyphs for the corresponding simple nasals as in Roman-script renditions. Bao Diop suggests that this choice might reflect some intuition on the part of the writer that these sounds are phonologically unitary, an intuition which oddly does not extend to prenasalized voiceless stops, which tend to be written digraphically (Bao Diop 2011).

A third attested scenario is the rendering of the same Wolof phoneme by different Arabic symbols in different instances. The phoneme /x/, for example, may be variously spelled as < > or < >. A fourth class is the adoption of Quranic forms instead of Classical or Modern Standard forms, as in the aforementioned example of dot placement on the letter for /f/ ( ) as well as the letter for /q/, for which wolofal uses < >.

Finally, a fifth mechanism addresses the vowel deficit. Wolof has 16 vowels (nine long and seven short), while Arabic has only six (three long and three short). This is resolved by conflation, which makes it arguably a subtype of the first adaptation strategy. The vowels / /, / /, / Diop 2011). In the presumable absence of any positional rules, a tabulation of this system would yield a very low phonemicity rating.

These features are all common but not standardized in their usage, and in a separate
essay, Bao Diop has suggested along with Lüpke (2011) that this is part of wolofal's broad popular appeal. Recent attempts to standardize ajami have tended to ignore regional varieties and have not been adopted by many wolofal writers. The strength of the code is that, as a mere by-product of learning to read and write Arabic rather than an intentional target of literacy in its own right, ajami becomes immediately accessible to anyone who has sufficiently progressed in Quranic education, without the need for any extra training in a separate set of rigid norms.

The corpus that Bao Diop examined in her original study consisted of religious literature that also happened to contain parallel transcriptions into a grassroots Roman-script orthography that differs considerably from the official standard. The grassroots orthography is significantly less phonemic, since it obscures the ATR contrast in the mid vowels. Several consonants are also spelled differently, as shown in the following comparative table.

Table 6-1. Comparison of Consonants in French-Based versus Standard Wolof Orthography

Phoneme    French-Based    Official Standard
/x/        < >             < >
/q/        < >             < >
/c/        < >             < >
/ /        < >             < >
/s/        < >             < >

All of these divergences from the Roman-script standard as well as the use of < > for the vowel Bao Diop compared the more Gallic orthography attested in her corpus to wolofal via a typographic efficiency analysis¹ in which the average cost of each orthography in keystrokes was measured. Capital and diacritic-bearing letters were treated as two keystrokes each. As an example from standard French, < > contains three segments and takes three keystrokes to type, so 3/3 = cost of 1. The capitalized form, < >, however, requires four keystrokes, so the cost is 4/3 = 1.333. Meanwhile, the six-letter word The average cost per word in wolofal was thus calculated to be 1.17, while the French-inspired system yielded an average cost per word of 1.06. This should pique our curiosity as to how the standard Latin-script orthography would fare in a similar study (Bao Diop 2011).

¹ While the experiment and analysis were all Bao Diop's, this term is my own coinage.

Despite its ambiguity with respect to a few consonants and especially the vowels of Wolof, the grassroots system that uses French-based patterns trumps the standard Romanization in its popularity. In fact, McLaughlin, as cited by Lüpke and Bao Diop, confirms that the official Wolof orthography is very seldom used by those other than linguists, educators, and a precious few Wolof writers. Literacy in the Franco-Wolof orthography presupposes and is a by-product of literacy in French itself, and those who are thus educated typically write in either French or wolofal. Roman-script Wolof adhering to standardized conventions is the least used written language of all in Senegal (Lüpke and Bao Diop 2011).

From a strictly linguistic perspective, the official Latin orthography for Wolof is ideal. The raw correspondences are isomorphic, so no positional rules appear to be necessary. Nearly every consonant phoneme is represented either by the appropriate IPA grapheme or by the most visually similar glyph available on an unaugmented QWERTY keyboard (cf. < > for < >). All sixteen vowels are also encoded in a similar fashion. Length contrast is conveyed through letter doubling, while the ATR tense/lax contrast among the mid vowels is indicated by using an acute accent on the tense vowels. Long tense vowels only bear acute diacritics on the first iteration of the relevant symbol (e.g. <ée> rather than <éé> for /e:/) (Fal 1991).
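
The doubling-plus-acute convention just described can be illustrated with a minimal sketch. It is not taken from Fal (1991) or from any official specification; the function name is hypothetical, and the sketch assumes that the mid vowels are written with the base letters e and o, whereas the text above attests only <ée>.

```python
# Assumption: the mid vowels use the base letters "e" and "o";
# only the spelling <ée> is attested in the surrounding text.
ACUTE = {"e": "é", "o": "ó"}


def spell_mid_vowel(base: str, tense: bool, long: bool) -> str:
    """Spell a mid vowel: acute accent for tense, letter doubling for long."""
    if base not in ACUTE:
        raise ValueError(f"not a mid-vowel base letter in this sketch: {base!r}")
    # Tense vowels carry the acute accent; lax vowels stay plain.
    first = ACUTE[base] if tense else base
    # Long vowels double the letter, and only the first copy bears the accent.
    return first + base if long else first


if __name__ == "__main__":
    for tense in (False, True):
        for long in (False, True):
            print(tense, long, spell_mid_vowel("e", tense, long))
```

Because the accent appears only once per vowel, a long tense vowel costs one diacritic rather than two.
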
This convention essentially exploits the phonotactic illicitness of sequences such as /e /, a very natural but also very useful prohibition in this case.

Nevertheless, we find in Senegal another linguistically optimal orthography that fails to compete adequately against a less technically precise system for socio-cultural and/or pragmatic reasons. In fact, while the historical examples examined in the previous chapter involved only one main target of ausbau per case, standard Romanized Wolof could be said to exist amidst great tension between two different contemporaneous ausbau models. If not the subjects of deliberate emulation, Arabic and French are at least the foundational literacies through which two unofficial Wolof orthographies are incidentally acquired. Moreover, with respect to French specifically, the most prominent politician in Wolof orthographic development, Senghor, seems to have somewhat embodied the conflict between ausbau and abstand, and that internal struggle manifested itself in his policies.

The key failure of the standard Wolof orthography is arguably that its design completely eschews not one but both of the more respectable literacies. In other words, when the consideration of prestige languages is perhaps at its most important given the multiplicity of superstrata, ausbau was instead disregarded most utterly. Furthermore, even between wolofal and Franco-Wolof, wolofal is somewhat limited in its respectability by its strong indexical ties to Islam, Mouridism, and the daara, though it still fares much better than the official orthography in terms of diffusion. It may seem that a pragmatic move by the Senegalese government might be to standardize and officialize wolofal, but the lack of success at standardization has already been noted. Its next best option, if it wishes to better harmonize with the people and perhaps improve literacy rates by doing so, might be to recognize and standardize the Franco-Wolof orthography. Writers may be more willing to write in Roman-script Wolof if their pre-existing knowledge of written French meant a gentler learning curve.
In any case, the orthographic landscape of Senegal remains a fascinating situation that reveals the persistence of traditional grassroots literacies even in the presence of a newer, professionally designed standard orthography with government backing.

Mauritian Kreol

Mauritian Kreol is a French-based creole spoken on the Indian Ocean island of Mauritius by a population mostly descended from French, Dutch, and English settlers as well as many African slaves. Modern Mauritius is an ethnically diverse nation in which French has been the traditional prestige language, English is also increasingly important, and Kreol remains relegated largely to the oral domain. Written Kreol, marginal though it may be, dates back to the 18th century, but its first significant promotion came in 1888, when the French linguist and folklorist Charles Baissac published a collection of folk tales in Kreol. The Baissac orthography was approximately half etymological and half phonological, seeking to "show both derivation and pronunciation" by maintaining clear links to its French source. Nevertheless, Kreol persisted throughout the 19th century with no single commonly used orthography, and even now, many Mauritians are under the impression that written Kreol simply does not exist, despite the availability of dictionaries (Rajah-Carrim 2008).

The various orthographies that have been proposed for Kreol can be placed on a spectrum from those favoring high etymological proximity to French to those favoring high phonemicity. The placement of any particular system on that continuum tends to strongly incite and/or reflect different ideological stances on the language itself. Largely etymological systems, for instance, reinforce the traditional perception of creoles as highly dependent and arguably "broken" dialects of their respective lexifiers, while more phonemic systems index nationalist independence. The first example of the latter type was the grafi riptir ("orthography of rupture") developed in 1967 by Dez Virahsawmy, an ex-politician with linguistic training.
He sought to "rupture" Kreol from French and legitimize it as an independent language in its own right. His proposed code was, as a result, highly phonemic and divergent from the French standard. As a consequence, however, the system rendered the transition between Kreol literacy and French literacy too difficult to gain a following, and it was never widely accepted.

A few years later, Philip Baker proposed another phonemic orthography in his 1972 Kreol grammar. One unique feature of this code was the use of < > as a vocalic nasalization marker, which was later replaced by < > or < > in Baker's joint 1987 work with Hookoomsing. Neither Baker's original proposal nor the revised version fared much better in terms of public acceptance. Significantly better received but still not ultimately accepted was the orthography developed at roughly the same time by Ledikasyon Pu Travayer (LPT), a prominent Kreol literacy initiative. The similarly phonemic LPT code used < > as a nasal vowel marker from its outset (Rajah-Carrim 2008).

Meanwhile, Virahsawmy introduced graphie d'accueil, a new proposal in 1985 that re-introduced the letters < >, < >, < >, and < >. Despite its French name, it was not much closer to French orthography than his previous proposals. The graphie d'accueil was later revised and then replaced in the 1990s by a third orthography, which has itself been revised at least once. In the interest of appeasing the Mauritian public, Virahsawmy has thus had to amend his proposals multiple times, and his 1990s system seeks to strike "an acceptable balance between phonemic principles and general sociological situations." The most recent Virahsawmy system is referred to as graphie and has been largely supported by the Catholic Church of Mauritius, which promotes Kreol literacy and engages in the ongoing translation of religious texts into Kreol using that orthography.
Although Virahsawmy views such institutional support as instrumental in the legitimization and standardization of a language and its written rendition, Rajah-Carrim (2008) suggests that a potential pitfall of the Church being the orthography's primary patron is that it could acquire a distinctly Christian indexicality and thus limit its sphere of usage as perceived by the general population.

The most recent efforts towards Kreol orthographic standardization are aimed at a grafi larmoni based on a "harmonic" hybrid of the and the LPT orthography. It currently uses < > or < > to mark vowel nasality, and ordinary Mauritian citizens are even invited to provide their comments and suggestions to the Ministry of Education. The project remains in the consultation phase (Rajah-Carrim 2008).

A 2000 census records that approximately 14% of the Mauritian population is illiterate, which was defined as unable to "read or write a simple sentence in any language." The highest illiteracy rates were found among women, people living in small villages, and people aged 60 or older. In a recent pilot study, Rajah-Carrim (2008) gathered 79 native Mauritians via personal contacts and indexed them for age, gender, ethnicity, and religion. Although there was a higher proportion of literate participants in this study than the national average, their levels of education ranged broadly from never having attended school at all to having earned postgraduate degrees. All subjects who self-reported as illiterate were 40 or older, belonged to the working class, and had never had access to education.

Until fairly recently, public literacy training has focused on the acquisition of French, English, and in some cases, an ethnic heritage language such as Hindi or Mandarin. The lack of a single official orthography causes many Mauritians to feel awkward about writing in Kreol, and it is "conceivable" to find a Mauritian who describes himself/herself as "literate" with the caveat that his/her literacy does not extend to Kreol. Although the LPT has been promoting formal Kreol literacy in a codified phonemic orthography among the working classes, most writing in Kreol remains a mere side effect of literacy in French and is therefore largely based on makeshift and ad hoc adaptations of French patterns (Rajah-Carrim 2008).

Each subject was asked the same three questions.
The first was whether or not they ever write in Kreol. One complication that the researcher quickly confronted was popular notions of what exactly qualifies as writing. Many participants, especially the younger ones, used Kreol in text messaging and e-mails but responded as if they never wrote in Kreol. This betrays a common perception that such informal and electronically mediated forms of communication are not "real" writing. It was only after the follow-up question specifically addressing electronic media that a significant number of subjects amended their initially negative responses. With text messaging and e-mails counted as forms of written Kreol, 38% of interviewees reported writing in Kreol. Crucially, texting and e-mailing make up the bulk of that proportion. Only ten people reported using written Kreol beyond those spheres, including one extreme outlier who turned out to be a prolific author of songs and short stories in the language. As one might expect, this meant that most of those recorded as writers of Kreol were in the younger age groups, with almost half of all 13-to-19-year-olds and over half of all 20-to-39-year-olds comprising the vast majority of Kreol writers (Rajah-Carrim 2008).

Use of written Kreol falls sharply after 40 years of age, and nobody at all under the age of 13 writes in Kreol. Rajah-Carrim suggests that the latter result can be explained by a few crucial observations. First, most of the preteen respondents were working class and lacked access to mobile phones. Second, at that age, most are still immersed in acquiring literacy in French and English. Finally, children arguably lack the nuanced "creativity" that lends itself to adapting the patterns of the prestige language(s) to Kreol in an ad hoc way without any normative standard, so they rely more on one of the formally taught languages in written communication. For some teenagers and adults, however, the flexibility granted by the lack of any set formal orthography for Kreol actually contributes to the very appeal of writing it, precisely because it allows greater creativity.
This is briefly compared to the role of electronic communication in Jamaican Creole, in which texting among Jamaican youth is contributing to a set of norms emerging bottom-up without any intervention from linguists or government officials (Rajah-Carrim 2008).

Meanwhile, the sharp drop in Kreol writing to only 19% among those 40 and above may be explained not only by the technological generation gap but also by a pedagogical one. Since the drive towards the formal teaching of Kreol spelling is a fairly recent movement, many of them were given at best a cursory glance at written Kreol in school. So, since they were not formally taught how to write Kreol, they do not feel confident attempting to do so and even less justified in claiming to be "literate" in it. For many, this difficulty and awkwardness due to a shortage of exposure fosters a purely practical preference "that invokes neither support nor condemnation" of Kreol as a language. Nevertheless, for others, a clear value judgment is also invoked, as Kreol is compared to French as a language suitable for respectable written discourse (and found wanting). Kreol is viewed as not connoting the refinement and socio-economic success that French indexes, and the former is thus deemed unworthy of a written register. In some cases, this prejudice is complicated by the lack of a single official orthography, which furthers the perception of Kreol as somehow broken. In this ambience, the predominance of text messaging and e-mails in the distribution of written Kreol might become a double-edged sword. It could be the critical path through which significant headway towards legitimization is finally made, or written Kreol could instead become further stigmatized due to its association with youth culture and the often unabashed informality of modern electronic communication (Rajah-Carrim 2008).

The second question in the study inquired about the specific spelling habits of those who did write in Kreol.
The question of how they write in Kreol often had to be clarified by asking if, for instance, they used the letters < > or < >, which are characteristic of very phonemic systems (since French orthography, on which a more etymological scheme would be based, rarely uses those graphemes). The most popular codes, used by 45% of Kreol writers, resembled the LPT orthography and were therefore on the phonemic end of the continuum. Only 34% preferred a system with a much higher degree of etymological proximity, and 7% reported using a system that lay closer to the middle of the phonemic-etymological spectrum. It should be noted that these were overwhelmingly grassroots orthographies, with each individual developing habits which served as conventions via self-regularization.

The researcher also identified two main motivations behind the choice of a more phonemic orthography: convenience and identity. The latter refers to the capacity of an orthography to distance Kreol from French, thus asserting the former as its own language for its own nation. Three respondents stated this motivation outright, calling their phonemic system(s) simply "the Kreol way" to write Kreol. Most of the phonemic spellers under 40 years of age and all of those over that age, however, cited convenience as the reason for their choice. For many, phonemic spelling is simply easier and more efficient (Rajah-Carrim 2008).

They seem to assume, however, a definition of simplicity that a few older participants did not appear to share. At least one subject reported preferring the more etymological approach for the very same reason that most of the pro-phonemicists gave for their preferences. For instance, Eric referred to spelling that is more loyal to French etymons as "easier to read," while Raymond the stark contrast between the generations regarding the role of written Kreol in their schooling. The elder generations were given little or no instruction in Kreol literacy. Instead, they were only ever taught to read and write in French and possibly English. For most of their lives then, including their formative years, that was the only literacy with which they were at all familiar. When confronted with a "modern" orthography, then, the sheer unfamiliarity of it all makes it seem too opaque and complicated.
The younger generations, on the other hand, have at least been introduced to a distinctly Kreol literacy in their education, and that inoculation of sorts allows them to grasp it much more easily (Rajah-Carrim 2008).

The final question was aimed at eliciting the subjects' personal opinions on whether or not literacy in Kreol should be promoted. While most answered in the affirmative, a substantial number of conditional or even downright negative reactions were recorded. Among the 15% of responses that outright said no to encouraging Kreol literacy, three basic notions emerged. First was the claim that Kreol was just too limited in its social mobility and scope of usage, especially outside of Mauritius. Second, some naysayers argued that there was no need to promote Kreol literacy, because French literacy automatically instills, as a naturally incidental by-product, the ability to encode Kreol in French conventions. Thirdly, some participants perceived Kreol as intrinsically oral and unfit for a written register. One critic called Kreol "not even a language," while another insisted that to write in Kreol is a sign of cultural and intellectual decay (Rajah-Carrim 2008).

The overall largest group favored the promotion of Kreol literacy but put forth conditions that must be met. Those conditions included the establishment of a single orthographic standard and the restriction of written Kreol to certain domains. There was also some concern that Kreol literacy would come at the expense of full literacy in French and/or English, as if the three would somehow compete for rather finite mental space. One outlier, the prolific author of Kreol songs and short stories mentioned earlier, quite contrarily saw strength in the diversity of norms and argued that flexibility of spelling conventions should not be discouraged (Rajah-Carrim 2008).

The second largest group, those who unconditionally supported the promulgation of written Kreol, justified their advocacy by raising any of three basic arguments.
First, it would enable people that can only speak Kreol, many of whom are elderly or young children, to become literate. Second, orthographic standardization would lend legitimacy and prestige to the language itself as separate from French. Third, it would grant Mauritians a unique written voice via which national identity and authenticity could be asserted.

Similar ideas of authenticity underlie the responses to the question of what the specific nature of an official Kreol orthography should be. Of the seven who had definite opinions on the matter, six favored the phonemic LPT system or at least something very much like it. In describing such a system, telling phrases such as "the true Creole" arose often, revealing that a phonemic orthography is held to be more authentic by these respondents. The only subject of the seven who argued in favor of a more etymological scheme was Eric, the same one quoted above as finding the "modern" orthography too difficult compared to his "Frenchified" code (Rajah-Carrim 2008). The researcher behind this pilot study surmised "an image of an official orthography emerging."

It may be worthwhile to note some of this author's own experience in his field work on Mauritian Kreol in a classroom setting. After phonetic data had been gathered via a word list and the phonology analyzed, a "practical orthography" was promptly and easily adopted for all further investigation. While there was the occasional uncertainty, the consultant demonstrated generally confident mastery of this system. Although the code in question is largely phonemic and LPT-like, making frequent use of < > and < > for instance, there were a few features that appeared suspiciously French, such as the spelling of /u/ as < > rather than just < > as well as frequent use of < > to represent / / (Carpooran 2011). Still, one characteristic that particularly impressed this author was the rule that a single coda < > was a silent marker which nasalized the preceding vowel, while a double < > in the same position signaled both a preceding nasalized vowel and a separately pronounced nasal consonant.
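
The single-versus-double coda rule described above can be sketched as a small reading function. This is only an illustration under stated assumptions: the marker grapheme did not survive in this copy of the text, so the default letter "n" below is a placeholder rather than a claim about the actual orthography, and the function name and return format are hypothetical.

```python
def read_nasal_rime(rime: str, marker: str = "n") -> dict:
    """Interpret a vowel-plus-coda spelling under the single/double marker rule.

    Assumption: `marker` stands in for the coda grapheme, which is missing
    from the source text; "n" is used purely for illustration.
    """
    doubled = marker * 2
    if rime.endswith(doubled):
        # Double marker: nasalized vowel plus a pronounced nasal consonant.
        vowel, nasalized, consonant = rime[: -len(doubled)], True, True
    elif rime.endswith(marker):
        # Single marker: silent, it only nasalizes the preceding vowel.
        vowel, nasalized, consonant = rime[: -len(marker)], True, False
    else:
        # No marker: plain oral vowel.
        vowel, nasalized, consonant = rime, False, False
    return {"vowel": vowel, "nasalized": nasalized, "nasal_consonant": consonant}


if __name__ == "__main__":
    for spelling in ("a", "an", "ann"):
        print(spelling, read_nasal_rime(spelling))
```

The doubled form has to be checked before the single one, since any spelling that ends in the double marker also ends in the single marker.
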
While the case of Mauritian Kreol should by now seem fairly typical in terms of the role played by ausbau and the prestige of former superstratal languages, two novel observations may be made as well. First, the lack of standardization is revealed to be both a cause and a result of continued skepticism regarding the legitimization of written Kreol. The lack of any agreed-upon standard spelling system exacerbates the perception of Kreol as strictly oral or otherwise less of a "language" than French or English, and that perception in turn hinders further progress towards the synthesis and official adoption of a unified orthographic standard. A similar feedback loop could also apply to the restriction of Kreol to a limited scope of usage that was astutely stated by some participants in the study. Its limited applicability discourages further standardization, which in turn prevents the expansion of Kreol's sphere(s) of usage.

Second, changes in technology and educational practices seem to be playing an especially prominent role in the Mauritian orthographic landscape, since they have both conspired to create quite a noticeable generation gap in public opinion. In such an environment, the indexicalities of e-mails, text messaging, and youth culture could easily transfer to the orthographies therein used, and in Mauritius, that is clearly what has happened. The informality prevalent in those media of communication has led to some arguing that written Kreol should be limited to correspondingly informal contexts. Meanwhile, curricula in public education have been shown to be capable of wielding great influence on what people consider easy or difficult. The comparatively recent integration of Kreol literacy training into public school has likely turned the tide considerably, and its ramifications will probably become clearer as today's youth become tomorrow's older generation.


CHAPTER 7
CONCLUSION

In this work, I have attempted to provide a survey of the concepts and issues that are likely to prove relevant to a project of orthographic development. I first endeavored to shed some light on how exactly orthography relates to language and noted the shortage of normalized or even semi-normalized analytical lenses through which to examine orthographic systems from at least two broad perspectives: the functional/linguistic and the socio-cultural. Next, I provided a brief overview of established terminology and arranged those terms into a typology of scripts. Then, I proposed a terminological framework and a methodology by which an orthography may be analyzed from a functional/linguistic standpoint, which is probably the most robustly original contribution made in these pages. That was followed by a counterpart apparatus for looking at an orthography from a socio-cultural perspective, introduced by a few historical illustrations aimed at demonstrating the motivation behind the key questions that make up that apparatus. Finally, an exploration of three contemporary case studies as revealed by various experiments provides examples not only of the ways in which the factors previously explored can manifest themselves and influence the spread of a spelling system, but also of how linguists may go about identifying and contextualizing the most important considerations in any particular orthographic project.

From this survey, a few broad and perhaps counter-intuitive conclusions emerge. First, an ideal alphabetic orthography is not necessarily one with strictly isomorphic correspondences that represents speech at the shallowest phonemic level, as demonstrated by the shortcomings of the Dschang tone-marked orthography. This is not to suggest, however, that extremely shallow orthographies are necessarily ill-advised in all cases, as shown by the findings of Seymour et al. (2003). Rather, optimal orthographic depth is a linguistic parameter whose ideal setting varies according to certain phonological properties of the host language. The exact range of such features seems a fertile field for future investigation.

Second, the tension between ausbau and abstand is an extremely frequent factor in influencing the nature and/or success of spelling systems. It was the common thread running through the discussion of Sranan, Wolof, and creoles both Haitian and Mauritian.

Thirdly, the interaction of sociolinguistic and functional considerations in orthographic development is not always monodirectional. Spheres of usage and standardization especially can both affect and be affected by orthographic habits and norms, as evidenced by the feedback loop seen in the case of Mauritian Kreol.

Finally, the most successful orthographies, at least in terms of popular use, need not always be implemented top-down. Wolofal, the single most widely used spelling system for Wolof, is an unregulated code that was developed bottom-up. Where such grassroots literacies exist, those seeking to codify and implement a successful standard orthography might be advised to draw their inspiration from these more naturalistically devised systems.

While the relative weights of the various functional/linguistic and socio-cultural factors are not one-size-fits-all and must be contextually determined, I have attempted to offer a semi-normalized framework for the much more straightforward estimation and discussion of where any particular orthographic proposal stands on each of these spectra. While the same diagnostics used to evaluate a system along a particular continuum could sometimes double as a potential aid in determining the comparative weight of that continuum itself, the best analogy for my goals in writing this may be Optimality Theory. If an optimal orthography were the final phonological output, I would hope to have at least listed and defined the basic constraints that orthographic designers can then rank according to what they observe in the relevant language and culture.


LIST OF REFERENCES

Bao Diop, Sokhna. Etude Comparative Entre Les Deux Systèmes d'Écriture Du Wolof: Deux Orthographes Pour Le Wolof. Éditions universitaires européennes, 2011. Web.

Bird, Steven. "Strategies for Representing Tone in African Writing Systems." Written Language & Literacy 2.1 (1999): 1-44. Web.

---. "When Marking Tone Reduces Fluency: An Orthography Experiment in Cameroon." Language and Speech 42.1 (1999): 83-115. Web.

Blommaert, Jan, Nathalie Muyllaert, Marieke Huysmans, and Charlyn Dyers. "Peripheral Normativity: Literacy and the Production of Locality in a South African Township School." Linguistics and Education 16.4 (2006): 378-403. Web.

Blommaert, Jan. Grassroots Literacy: Writing, Identity and Voice in Central Africa. Routledge, 2008. Web. < AgAAQBAJ >

Carpooran, Arnaud. Lortograf Kreol Morisien. Phoenix, Mauritius: Akademi Kreol Morisien, 2011. Web. < http://ministry .pdf >

Cook, Vivian. "Knowledge of Writing." International Review of Applied Linguistics in Language Teaching 39.1 (2001): 1-18. Web. < in.aspx?direct=true&AuthType=ip,uid&db=ufh&AN=6506 518&site=ehost live >

Evers, Cécile. "Orthographic Policy and Planning in Sénégal/Senegaal: The Détournement of Orthographic Stereotypes." Working Papers in Educational Linguistics 51 (1921): 2011. Web.

Fal, Arame. Alphabétisation En Wolof: Guide Orthographique. Dakar: A. Fal, 1991. Print.

Hansen, Hardy, and Gerald M. Quinn. "Introduction." Greek: An Intensive Course. Fordham Univ Press, 1992. 1-16. Print.

i, Vladimir. "Formes Parlées, Formes Écrites et Systèmes Orthographiques des Langues." Folia Linguistica: Acta Societatis Linguisticae Europaeae 5 (1971): 185-93. Web. < 1 2/flin.1969.5.1 2.185/flin.1969.5.1 2.185.xml >

Jahr, Ernst Håkon. "Parallels and Differences in the Linguistic Development of Modern Greece and Modern Norway." Language Conflict and Language Planning. Berlin: Mouton De Gruyter, 1993. 83-97. Print.
Katz, Leonard, and Ram Frost. "The Reading Process is Different for Different Orthographies: The Orthographic Depth Hypothesis." Advances in Psychology 94 (1992): 67-84. Web. < 892 >
Kloss, Heinz. "'Abstand Languages' and 'Ausbau Languages'." Anthropological Linguistics (1967): 29-41. Web. < id=3739256&uid=37531&uid=37526&uid=62&uid=3739600&uid=67 >
Lillis, Theresa. "Writing in Sociolinguistics." The Sociolinguistics of Writing. Edinburgh: Edinburgh Sociolinguistics, 2013. 1-19. Print.
Lüpke, Friederike, and Sokhna Bao-Diop. "Beneath the Surface? Contemporary Ajami Writing in West Africa, Exemplified through Wolofal." (2014). Web. < f >
Lüpke, Friederike. "Orthography Development." Handbook of Endangered Languages. Eds. Peter K. Austin and Julia Sallabank. Cambridge University Press, 2011. 312-336. Web.
Matsukas, Aristarhos. "Pronunciation Guide." Complete Greek. Blacklick, OH: McGraw-Hill, 2010. xxii-xxx. Print. Teach Yourself.
Rajah-Carrim, Aaliya. "Choosing a Spelling System for Mauritian Creole." Journal of Pidgin and Creole Languages 23.2 (2008): 193-226. Web.
Rogers, Henry. Writing Systems: A Linguistic Approach (Blackwell Textbooks in Linguistics). Wiley-Blackwell, 2004. Print.
Sebba, Mark. Spelling and Society: The Culture and Politics of Orthography Around the World. Cambridge University Press, 2007. Web. < ling_and_Society?id=JHgsf ADZF9IC >
Seymour, Philip HK, Mikko Aro, and Jane M. Erskine. "Foundation Literacy Acquisition in European Orthographies." British Journal of Psychology 94.2 (2003): 143-74. Web.
Upward, Christopher, and George Davidson. The History of English Spelling. Malden, MA: Wiley-Blackwell, 2011. Print.


Van Den Bosch, Antal, Alain Content, Walter Daelemans, and Beatrice de Gelder. "Analyzing Orthographic Depth in Different Languages using Data-Oriented Algorithms." Proceedings of the 2nd International Conference on Quantitative Linguistics (1994). Web.


BIOGRAPHICAL SKETCH

Gregory H. Bontrager's interest in human languages began in high school, from which he graduated with college credits in both Spanish and French via Advanced Placement testing. He earned his Bachelor of Arts degree in Spanish from Florida Gulf Coast University in 2010 and is currently pursuing graduate studies in linguistics at the University of Florida. Besides Spanish and French, Bontrager has studied four semesters of German and two semesters of Mandarin Chinese. Through self-instruction, he has also acquired a solid foundational knowledge of Italian, Latin, and classical Greek.