USING MULTILEVEL RECONSTRUCTION APPROACH
FOR MACHINE TRANSLATION FROM ENGLISH TO CHINESE
VIA LINGUISTIC CANONICAL FORM
By
KEFENG CHEN
A DISSERTATION PRESENTED TO THE GRADUATE SCHOOL OF THE UNIVERSITY OF FLORIDA IN PARTIAL FULFILLMENT
OF THE REQUIREMENTS FOR THE DEGREE OF
DOCTOR OF PHILOSOPHY
UNIVERSITY OF FLORIDA
1992
ACKNOWLEDGEMENTS
The author wishes to acknowledge his advisor and
supervisory committee chairman, Dr. Julius T. Tou, for his counsel, guidance, and assistance throughout the entire course of this study.
The author would also like to express his deep appreciation to Dr. Chauncey C. Chu for his invaluable advice
in linguistics and for correcting the manuscript of this dissertation. The author is also greatly indebted to all the other members of his supervisory committee, Dr. Jose C. Principe, Dr. John Staudhammer, Dr. Mark Yang, and Dr. Myron
N. Chang, for their suggestions and advice regarding this dissertation.
He would like to thank all the members and good friends
of the Center for Information Research for their helpful discussions. Thanks are also due to a good friend, Dr. Mingling Hu, for his valuable discussions.
Finally, the author would like to thank his wife for her support, patience, and encouragement throughout these difficult years.
TABLE OF CONTENTS
ACKNOWLEDGEMENTS
ABSTRACT
CHAPTER 1 INTRODUCTION
1.1 Motivation for the Research
1.2 Scope and Research Objectives
1.3 Approach
1.4 Organization of the Dissertation
CHAPTER 2 MT SYSTEM DESIGN STRATEGIES AND PREVIOUS RESEARCH
2.1 Introduction
2.2 System Design Strategies
2.2.1 The Direct Strategy
2.2.2 The Interlingual Strategy
2.2.3 The Transfer Strategy
2.3 English-to-Chinese MT Research
2.3.1 Direct-Approach-Based System
2.3.2 Knowledge-Based System
2.3.3 Transfer-Approach-Based System
2.4 Conclusion from Previous Work
CHAPTER 3 LINGUISTIC FEATURES OF CHINESE LANGUAGE FROM THE VIEWPOINT OF ENGLISH THROUGH LCF
3.1 Introduction
3.2 The Word and Word Classes
3.3 Expression of Grammatical Categories
3.3.1 Number
3.3.2 Definite and Indefinite Reference
3.3.3 Subordination and Modification
3.3.4 Case Relationships
3.3.5 Aspect and Voice
3.3.6 Negation
3.3.7 Modality
3.4 The Sentence
3.4.1 Subject and Topic
3.4.2 Predicate
3.5 Summary
CHAPTER 4 MULTILEVEL RECONSTRUCTION TRANSLATION MODEL
4.1 Introduction
4.2 The Overall Structure of the Model
4.2.1 Sentential Reconstruction Module
4.2.2 Phrasal Reconstruction Module
4.2.3 Lexical Reconstruction Module
4.3 Sentential Level Reconstruction
4.3.1 Syntactical Rule-Based Reconstruction
4.3.2 Pattern-Based Reconstruction
4.3.3 Structural Feature-Based Reconstruction
4.3.4 Complex Sentence Reconstruction
4.4 Phrasal Level Reconstruction
4.4.1 Adverbial Position
4.4.2 Multiple Word Phrase Reconstruction
4.4.3 Multiple Adverbial Phrases Ordering
4.5 Lexical Level Reconstruction
4.5.1 Noun
4.5.2 Verb
4.5.3 Preposition
4.6 Summary
CHAPTER 5 IMPLEMENTATION OF THE MULTILEVEL RECONSTRUCTION MODEL
5.1 Introduction
5.2 Linguistic Knowledge Representation
5.3 Translation Knowledge Representation and Knowledge Base Organization
5.3.1 Production System for Knowledge Representation
5.3.2 Knowledge Base Organization and Control Strategy
5.3.3 Three-Level Reconstruction Rules
5.4 Implementation of AUTOTEC-GEN
5.4.1 Sentence Structure Determination
5.4.2 Phrasal Level Processing
5.4.3 Lexical Level Implementation
5.5 Summary
CHAPTER 6 RESULT AND CONCLUSION
6.1 Illustrative Example of Translation
6.2 System Test and Discussion
6.3 Summary
6.4 Conclusion
6.5 Areas for Future Work
APPENDIX A TRANSLATION SAMPLES
APPENDIX B TRANSLATION WITHOUT SENTENTIAL LEVEL PROCESSING
REFERENCES
BIOGRAPHICAL SKETCH
Abstract of Dissertation Presented to the Graduate School of the University of Florida in Partial Fulfillment of the Requirements for the Degree of Doctor of Philosophy
USING MULTILEVEL RECONSTRUCTION APPROACH
FOR MACHINE TRANSLATION FROM ENGLISH TO CHINESE
VIA LINGUISTIC CANONICAL FORM
By
Kefeng Chen
August 1992
Chairperson: Dr. Julius T. Tou
Major Department: Electrical Engineering
This dissertation presents a multilevel reconstruction approach via linguistic canonical form (LCF) for machine translation (MT) from English to Chinese. English and Chinese
belong to two different language groups. Their linguistic differences are large and are a major obstacle to high-quality MT. The multilevel reconstruction approach via LCF follows the hierarchical structure of language and translates English into Chinese in sentential, phrasal, and lexical order. It provides an effective means to bridge the linguistic gap between English and Chinese step by step along the language structure hierarchy.
The multilevel reconstruction approach first establishes the most suitable Chinese sentence structure for translating the English sentence. The English sentence pattern, structural features, and other syntactic and semantic information are used to make the decision. After a sentence structure is established, the phrasal level differences are processed. Phrasal syntactic transformation rules are used to reconstruct English phrasal expressions in Chinese order. A larger-concept-first principle is used to obtain the correct Chinese order of multiple adverbial phrases. After these two levels of reconstruction, the sentence becomes the LCF for English-to-Chinese translation. At the lexical level, all English words in the LCF are substituted with Chinese words, and word-dependent adjustments are performed. The lexical reconstruction finalizes the translation.
The multilevel reconstruction approach breaks down the translation process into three stages. In each stage the system concentrates on one linguistic layer and obtains an optimized result. Thus, high translation quality can be achieved. The hierarchical processing approach also simplifies translation rule writing, knowledge-base organization, and programming. It gives the system the flexibility to expand and absorb new sentence patterns and expressions, and also enhances system maintainability.
A prototype system, AUTOTEC-GEN, has been built based on the multilevel reconstruction approach. A text corpus has been tested, and the results were judged to be quite good by native speakers.
CHAPTER 1
INTRODUCTION
1.1 Motivation of the Research
A machine translation (MT) system basically consists of two major functional parts: a source language (SL) analysis
part and a target language (TL) generation part (Hutchins, 1986; Slocum, 1985; Tucker, 1987). Depending on the strategy used in the system, these two parts can be either separate or
integrated. This dissertation is mainly concerned with the problem of MT from analyzed SL to TL generation, specifically, from analyzed English sentence to Chinese sentence generation in an English-to-Chinese MT system.
The success of any machine translation system is measured in terms of the quality of its output, that is, the output of
its generator. A good MT system must be able to appreciate the nuances of the word choices in the target language and the inferences that are invited by alternative syntactic phrasings and sentential structures. If such choices are not made
deliberately with an appreciation of their consequences, then the sense of the original text will be distorted. This fact is not well appreciated by the bulk of the MT community (McDonald, 1987).
The reason for focusing on MT generation is that
relatively little research has directly addressed the task of TL generation in MT systems, and most MT research thus far has been concentrated on SL analysis (Hutchins, 1986; Y. Liu, 1984). However, from the viewpoint of translation quality, both the analysis and generation play an important role in MT
systems. Overlooking either one of them will result in a failure to achieve high-quality translation.
English-to-Chinese MT has been studied for more than 30 years (Y. Liu, 1984; Hutchins, 1986). Many systems have been reported (Li & Chang, 1988; Tan, 1988; Yu & Mao, 1988; Huang, 1987; Pan, 1987; Su, Chang & Hsu, 1987; Jin & Simmons, 1986;
Dong, 1984; Liu, 1982). But all these systems either suffer from low translation quality or are restricted to a very narrow domain and a narrow range of syntactic phenomena. The lack of effective
and comprehensive generation algorithms for synthesizing a TL sentence from the analyzed SL certainly is one of the reasons. Since English and Chinese belong to different language groups, their linguistic differences are quite large. Interference from English syntactic structure and word order (Newmark, 1981) can be severe in the translation process. The MT system is likely to produce unnatural, English-like Chinese sentences when the generation scheme is not carefully
designed. This problem cannot be solved by a simple generation design. It is the purpose of this dissertation to
develop a comprehensive approach to Chinese sentence generation for an English-to-Chinese MT system.
1.2 Scope and Research Objectives
Translation is not a pure linguistic operation but "rather must be thought of as a psycholinguistic, sociolinguistic, and pragmalinguistic process" (Wilss, 1982,
65). If the MT system is designed to work on a general subject, it must have some general world knowledge and cultural background information available in the computer. To put general world knowledge and cultural information into a
computer is still a research topic and cannot be realized today (Brittan, 1987; Nagao, 1989). When the source language
documents to be translated are limited to a specific domain of science and technology, there will be no difference in
cultural background between the source language and the target language. The translation problems can be limited to the syntactic, semantic, and contextual information plus culture-independent, domain-specific knowledge.
Translation methods vary from literal to free creative (Nida, 1976; Newmark, 1981). Nagao (1986) summarized that translation can be achieved at the following four levels:
1) Free and creative translation which aims for the same mental reactions by the readers of source and target languages.
2) Sentence-by-sentence translation in the free sentential style. Language particularities of both languages are fully considered in translation.
3) Literal translation. The sentential structures of the source language remain in the target language. However, the selection of translation words is correct from the standpoint of semantics.
4) Mechanical translation. The sentential structure of the target language is a crude mapping of the one in the source language, and the selection of translation words is almost one to one.
Translation at the first level is highly creative. It is
usually used in literary works, such as poetry translation. Machine translation is not expected to achieve this goal in the near future (Slocum, 1986).
Translation at the fourth level results in strings of words where meaning is usually obscured. This translation level is quite unsatisfactory between languages of widely different characteristics such as English and Chinese. Many of the current commercial MT systems are at this level. Systems
of this level usually require heavy revision of the translated text by a human translator. The sentential structures of the input sentences and the meanings of words are strictly limited.
Most of the English-to-Chinese MT systems reported are at the third level, the literal translation level, or at best between the second and the third levels.
The objective of this dissertation is to develop a high-quality translation approach aimed at translating science and technology papers from English into Chinese at the sentential translation level.
1.3 Approach
To achieve high-quality translation in an MT system, there must be effective models of choices of linguistic
structures and devices that the TL offers to relay the SL meaning. There must be a rich understanding of the grammatical capacity of the TL, of the dependencies between the alternatives the TL offers, and of how they may be navigated
by the procedures of the generator as it constructs the TL text.
Although generation is not well studied in the MT field, artificial intelligence (AI) research in generation independent of MT has made considerable progress. Artificial intelligence generation systems have been developed that are capable of quite delicate decisions of phrasing and content
flow (McDonald, 1980; Appelt, 1985; Jacobs, 1985; McKeown, 1985; Hovy, 1988). They are now in a position to supply the generation capacities that strong MT systems will need.
Machine translation has three general approaches: direct approach, interlingual approach, and transfer approach. In the direct approach, the TL generation is integrated with the SL
analysis. They do not have clear boundaries. In this approach, the generation is not an independent process.
In the interlingual approach, the generation process is completely SL independent. The TL text is generated from an
intermediate language (interlingua). This is similar to AI generation. Actually, when the interlingua is independent
enough, the MT generation becomes closer to AI generation. For example, AI generation based on conceptual dependency theory (Schank, 1975) and MT generation are not much different
(Ishizaki, 1988; Luckhardt, 1988; Carbonell, Cullingford & Gershman, 1981). The major problem with the interlingual approach is that "linguists are not yet able to specify an interlingua" (Slocum, 1989, 22). Another is that after the SL text is transformed into the interlingua, the features shared
by SL and TL, which can be used in generation, are lost (Nagao, 1987).
The transfer approach does not have the problems which
haunt the direct and interlingual approaches. The transfer approach separates the analysis and the generation processes and still keeps certain SL information. It has become a commonly used approach in MT system design, especially for translation between a single pair of languages. In this dissertation, the translation will be based on the transfer approach.
Generation in an MT system deals with how to generate a corresponding TL sentence from an analyzed SL sentence. Tou (1988) developed a linguistic canonical form (LCF)¹ approach which is intended to break the cultural barrier in translation. In this dissertation, we have developed a multilevel reconstruction approach for generating Chinese sentences via LCF from analyzed English sentences. Using the LCF concept and following the sentential, phrasal, and lexical hierarchical structure of language (Wirth, 1985), the generation of Chinese sentences from analyzed English can be divided into three levels of processing (Chen & Tou, 1991). They are (1) sentential level processing, which determines the TL sentence structure for the SL sentence translation; (2) phrasal level processing, which renders all SL phrases into appropriate TL phrases (after this level of processing, the sentence becomes the LCF); and (3) lexical level processing, which replaces all SL words with appropriate TL words and makes word-specific adjustments.
¹ The linguistic canonical form here is defined by Dr. J. T. Tou in his 1988 paper. It is different from the meaning used in linguistics.
A sentence structure is the foundation of a sentence. English and Chinese may have different ways to express an entity or idea. The sentential reconstruction establishes the most pertinent sentence structures for Chinese translation.
The reconstruction is based on English sentence syntactic and semantic features and information. These include sentence pattern, voice, verb, and the characteristics of post verb elements.
The phrasal reconstruction takes sentence constituents as its working elements. It resolves the phrasal level differences between English and Chinese and expresses the English phrases according to Chinese convention. Through the phrasal reconstruction, the English sentence becomes a string of English words arranged according to Chinese grammar and word order. This is referred to as the LCF of English-to-Chinese translation.
The lexical reconstruction substitutes all English words with Chinese words. The Chinese words are chosen from a bilingual dictionary following word selection principles. The lexical reconstruction also performs necessary postprocessing, such as realizing tense and aspect and handling Chinese word-specific requirements. It finalizes the translation.
The multilevel reconstruction via LCF translation
approach focuses on improving the translation quality of MT systems. It follows the sentential, phrasal, lexical hierarchical structure of language and reconstructs TL
sentences in the same order. In the multiple level processing, the differences between English and Chinese can be bridged one level at a time, from sentential level to lexical level. The
translation process can be controlled, and high quality can be achieved.
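The three levels described above can be illustrated with a minimal sketch. It is not the AUTOTEC-GEN implementation: the class `AnalyzedSentence`, the rule tables, the toy dictionary, and the single adverbial-fronting rule are all assumptions introduced here for illustration. The sketch shows only how the sentential, phrasal, and lexical steps pass their results along, with the LCF appearing as the intermediate string of English words in Chinese order.

```python
from dataclasses import dataclass


@dataclass
class AnalyzedSentence:
    """Hypothetical stand-in for the analyzer's output (not from the source)."""
    pattern: str          # e.g. "SVA": subject, verb, adverbial
    constituents: dict    # role name -> list of English words


# Toy rule tables, assumed for illustration only.
SENTENCE_STRUCTURES = {"SVA": ["subject", "verb", "adverbial"]}
TOY_DICTIONARY = {"he": "他", "in": "在", "Beijing": "北京", "works": "工作"}


def sentential_reconstruction(sentence):
    """Level 1: choose a target sentence structure as an ordered list of roles."""
    return list(SENTENCE_STRUCTURES[sentence.pattern])


def phrasal_reconstruction(structure, sentence):
    """Level 2: arrange phrases in Chinese order; the result is the LCF."""
    order = list(structure)
    if "adverbial" in order and "verb" in order:
        # Toy rule: the adverbial phrase is placed before the verb.
        order.remove("adverbial")
        order.insert(order.index("verb"), "adverbial")
    return [word for role in order for word in sentence.constituents[role]]


def lexical_reconstruction(lcf):
    """Level 3: substitute each English word in the LCF with a Chinese word."""
    return [TOY_DICTIONARY[word] for word in lcf]


def translate(sentence):
    structure = sentential_reconstruction(sentence)
    lcf = phrasal_reconstruction(structure, sentence)
    return lcf, "".join(lexical_reconstruction(lcf))


example = AnalyzedSentence(
    "SVA",
    {"subject": ["he"], "verb": ["works"], "adverbial": ["in", "Beijing"]},
)
lcf, chinese = translate(example)
print(lcf)      # ['he', 'in', 'Beijing', 'works']
print(chinese)  # 他在北京工作
```

Keeping each level in its own function mirrors the claim that the differences between the two languages can be bridged one level at a time: a new sentence pattern or phrasal rule touches only one function and its table.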
1.4 Organization of the Dissertation
Chapter 2 presents a brief review of three major MT system approaches, one direct and two indirect translation approaches, and their TL generation strategies. Some previous English-to-Chinese MT research is also reviewed.
Chapter 3 discusses the linguistic differences between English and Chinese. English and Chinese belong to two different language groups. Their differences exist at all linguistic levels. Identifying these differences has a direct effect on MT system design. In this chapter, differences between English and Chinese are discussed from the lexical level to the phrasal level and to the sentential level.
Chapter 4 presents a novel multilevel reconstruction via LCF translation model. This model aims at producing high-quality, natural Chinese sentences. It starts with an analyzed English sentence, and it generates a corresponding Chinese sentence following the language hierarchical structure in a sentential, phrasal, and lexical order. In this hierarchical process, large linguistic gaps can be bridged well.
Chapter 5 presents the design of a generator, AUTOTEC-GEN, in an English-to-Chinese MT system. The generator has been developed as a practical example of the multilevel reconstruction translation model presented in Chapter 4. The techniques of linguistic information representation and other knowledge representation are discussed. The three levels of the reconstruction process are given in detail.
Chapter 6 starts with a sentence translation example. It then summarizes the major accomplishments in this dissertation and provides some suggestions for further research.
The Appendix presents some experimental results of translated Chinese sentences from analyzed English sentences using AUTOTEC-GEN and three-level reconstruction rules.
CHAPTER 2
MT SYSTEM DESIGN STRATEGIES AND PREVIOUS RESEARCH
2.1 Introduction
A great amount of research has been carried out in the past 40 years for machine translation (Hutchins, 1986;
Slocum, 1985; Tucker & Nirenburg, 1984). Three system design strategies have been developed and used in most MT systems. Among them, the direct strategy was the first. It was used mostly in the early MT systems. The interlingual and transfer strategies came later but became more popular than the direct strategy in MT system design. Most English-to-Chinese systems used either the direct or the transfer strategy. In this chapter, the three MT system design strategies are introduced, and some English-to-Chinese MT systems are reviewed.
In Section 2.2 the three system design strategies are described from the viewpoint of how an MT system is configured and what stages are involved in an MT system. In Section 2.3 some English-to-Chinese MT systems are reviewed. Problems encountered with the previous English-to-Chinese MT systems are discussed in Section 2.4.
2.2 System Design Strategies
Translation may be regarded as analyzing an input in the
SL and synthesizing an equivalent output in the TL. With regard to the analysis and synthesis stages, the MT system design has three types of strategy.
2.2.1 The Direct Strategy
The direct translation strategy consists of a mapping from source to target language without an intermediate representation (Fig. 2.1). Direct translation systems do not contain a meaning processing component. Only minimal and source language-specific disambiguation is performed. The systems are usually designed in all details specifically for
one particular pair of languages. The basic assumption is that the vocabulary and syntax of SL texts need not be analyzed any more than strictly necessary for the resolution of
ambiguities, the correct identification of appropriate TL expressions, and the specification of TL word order. Thus, if the sequence of SL words is sufficiently close to an
acceptable sequence of TL words, then there is no need to identify the syntactic structure of the SL text. A primary characteristic of direct translation systems is that no clear distinctions are made between stages of SL analysis and TL synthesis. The direct translation strategy is the earliest approach used in MT system design. The majority of MT systems of the 1950s and 1960s were based on this strategy (Hutchins,
1986). They differed in the amount of analysis and/or restructuring they incorporated. Some early direct translation systems included IBM's Mark I and II (Shiner, 1958; Bower & Fisk, 1965), Georgetown University's GAT (Dostert, 1955), and the
SYSTRAN system (Toma, Kozlik & Perwin, 1970). Some English-to-Chinese translation systems also adopted the direct translation approach.
[Diagram: SL text -> Analysis and Synthesis (using SL-TL dictionaries and grammars) -> TL text]
Fig. 2.1 Direct Translation System Structure
2.2.2 The Interlingual Strategy
The interlingual strategy assumes that it is possible to convert SL texts into universal, language-independent representations. From such interlingual representations, other language texts can be generated. The common argument for translation via an interlingua is one of economy of effort in a multilingual environment (Andreev, 1967).
In interlingual systems, translation from SL to TL is in two distinct and independent stages (Fig. 2.2). In the first stage SL texts are fully analyzed into interlingual representations, and in the second stage interlingual forms
are the sources for generating TL texts. Procedures for SL analysis are intended to be SL-specific and are not devised for any particular TL in the system. Target language
generation is intended to be TL-specific. In principle, the interlingual approach can dispense with bilingual components. If there are n languages involved and translation is to be from and into each of them, then a direct system would need n(n-1) bilingual translation programs, one for each ordered pair of languages. However, if translation is via an interlingua, the system would need only n parsers and n generators.
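The economy argument can be checked with a few lines of arithmetic. The sketch below simply evaluates the two counts stated above, n(n-1) direct programs against n parsers plus n generators; the function names are introduced here for illustration.

```python
def direct_programs(n: int) -> int:
    """Direct strategy: one program per ordered language pair."""
    return n * (n - 1)


def interlingual_modules(n: int) -> int:
    """Interlingual strategy: n parsers plus n generators."""
    return 2 * n


for n in (2, 3, 5, 10):
    print(n, direct_programs(n), interlingual_modules(n))
# 2 2 4
# 3 6 6
# 5 20 10
# 10 90 20
```

By this count alone the two strategies break even at three languages, and the interlingual design needs fewer modules only beyond that, which is why the argument is made for multilingual environments rather than for a single language pair.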
[Diagram: SL text -> Analysis (SL dictionaries and grammars) -> Interlingual representation (SL-TL dictionary) -> Synthesis (TL dictionaries and grammars) -> TL text]
Fig. 2.2 Interlingual System Structure
Interlingual systems differ in their conceptions of an interlingual language: a logical artificial language or a natural auxiliary language such as Esperanto, a set of semantic primitives common to all languages, a universal vocabulary, etc. Interlingual MT projects have also differed according to the emphasis on lexical aspects and on syntactic aspects. Some concentrated on the construction of interlingual lexica, such as CLRU's system (Masterman, 1957); others have concentrated on interlingual syntax, such as DLT (Witkam,
1983) and TRANSLATOR (Nirenburg, Raskin & Tucker, 1985). One
problem with the interlingual approach is that the proper interlingua can hardly be specified (Slocum, 1985).
2.2.3 The Transfer Strategy
In the transfer strategy, an SL sentence is first parsed into abstract internal representations (usually some sort of annotated structure). Thereafter, a transfer is made at both the lexical and structural levels into corresponding structures in the TL. In the third stage, the translation is generated (Fig. 2.3). Three dictionaries are usually needed for transfer: an SL dictionary, a bilingual transfer dictionary, and a TL dictionary. The approach is an improvement over direct translation systems, in which no structural information is used. It also avoids the problems of the interlingual approach. Whereas the interlingual approach necessarily requires complete resolution of all ambiguities and anomalies of SL texts so that translation is possible into any other language, the transfer approach tackles only those ambiguities inherent in the language in question.
The level of transfer differs from system to system: the representation varies from purely syntactic deep structure markers to syntactico-semantic annotated trees. In the early systems, such as MIT's system (Yngve, 1967), analysis went no further than surface syntactic structures, and structural transfer therefore took place at this depth of abstraction. Later (post-1970) transfer systems have taken analysis to deep semantico-syntactic structures, with correspondingly more abstract transfer representations and transfer rules. Examples are the University of Montreal's TAUM (Kittredge, 1972), the University of Grenoble's GETA (Vauquois & Boitet, 1984), the University of Texas at Austin's METAL (Bennett & Slocum, 1985), and the European Community's EUROTRA (King, 1982; Johnson, King & Tombe, 1985).
[Diagram: SL text -> Analysis (SL dictionaries and grammars) -> SL representation -> Transfer (SL-TL dictionary, transfer rules) -> TL representation -> Synthesis (TL dictionaries and grammars) -> TL text]
Fig. 2.3 Transfer System Structure
The basic difference between the direct approach and the two indirect approaches lies in the configuration of the dictionary, the grammar data, and the separation of SL analysis and TL generation process. In the direct system, the main component is a single SL-TL bilingual dictionary incorporating not only information on lexical equivalents but also all data necessary for morphological and syntactic
analysis, transfer, and synthesis. In the indirect system, this information is dispersed among separate SL and TL
dictionaries, separate SL and TL grammars, and either the interlingua vocabulary and syntax or the SL-TL transfer dictionary (of lexical equivalences) and a grammar of SL-TL structure transfer rules.
2.3 English-to-Chinese MT Research
English-to-Chinese MT research has been carried out for
several decades. Most of this research has used either the direct translation approach or the transfer approach, since most of the MT systems are single language pair systems.
2.3.1 Direct-Approach-Based System
The direct translation system is concerned mainly with the efficiency of translation and assumes that the SL and TL have similar sentential structures; thus there is no need for
sentential structure change in TL. It usually deals with a very restricted domain. Tan's financial news translation system (Tan, 1988) is such a system.
This system does not have clear English sentence analysis or a Chinese sentence generation process. It substitutes English words with Chinese words as much as possible without analyzing SL text until it becomes necessary to do so.
The process is called multipass substitution. It starts by recognizing special phrases or words in a sentence without any analysis. If a phrase or word is identified, it is substituted immediately. The process of recognition and substitution is organized into passes according to phrase length: each pass scans the sentence for phrases of a given number of words and translates them as single units. Phrases composed of the most words are processed in the first pass, while individual words are processed in the final pass.
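The pass ordering can be made concrete with a small sketch. This is not Tan's implementation, which is not described in detail here; the function `multipass_substitute` and the toy phrase dictionary are assumptions for illustration. The sketch shows only the longest-phrase-first ordering and the fact that a word already consumed by a longer phrase is not translated again on a later pass.

```python
# Toy phrase dictionary keyed by word tuples (assumed for illustration only).
PHRASE_DICTIONARY = {
    ("stock", "market"): "股市",
    ("market",): "市场",
    ("prices",): "价格",
    ("rose",): "上涨",
}


def multipass_substitute(words, dictionary):
    """Substitute dictionary phrases pass by pass, longest phrases first."""
    items = [(word, False) for word in words]   # (text, already substituted?)
    longest = max(len(key) for key in dictionary)
    for n in range(longest, 0, -1):             # one pass per phrase length
        result = []
        i = 0
        while i < len(items):
            window = items[i:i + n]
            key = tuple(text for text, _ in window)
            untouched = not any(done for _, done in window)
            if len(window) == n and untouched and key in dictionary:
                result.append((dictionary[key], True))
                i += n
            else:
                result.append(items[i])
                i += 1
        items = result
    return [text for text, _ in items]


print(multipass_substitute("stock market prices rose".split(), PHRASE_DICTIONARY))
# ['股市', '价格', '上涨']
```

Because the two-word pass runs first, "stock market" is translated as a unit and the single-word entry for "market" is never applied to it. Word order is left untouched, which is consistent with the observation that such systems keep word order changes to a minimum.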
There is no general linguistic theory, parsing
principles, or generation strategy involved in the system. The system is totally dependent on well-developed dictionaries, morphological analysis, and text-processing software to gain
credible translations of the English text into a series of reasonably equivalent words and phrases in Chinese. Chinese sentence generation takes place at the phrase and word levels. There is no sentential structure revision. Word order change is kept to a minimum. The translations are mechanical, unnatural, and English-like. This approach is usually limited to a very narrow domain.
2.3.2 Knowledge-Based System
Tou's CATEC (Tou, 1988) is a knowledge-based English-to-Chinese MT system. The aim of CATEC is to translate technical
papers from English to Chinese. It is based upon an innovative idea of linguistic canonical form transformation in order to
incorporate the cultural aspects of a natural language. It
assumes that the syntactic structure and sentence composition of a natural language reflect the cultural background of the people. In this system, the SL is first transformed into the linguistic canonical form. The linguistic canonical form of a source language with respect to a target language is the SL expression of thoughts and sentence structures that would be
produced by persons brought up in the culture of the target language.
The linguistic canonical form is closer to the target language in either the linguistic or the cultural aspect. It is easier to translate from the linguistic canonical form than from source texts. The input source language sentences are transformed into canonical form with a knowledge base. Human intervention may occur in this stage. The target language texts are generated from the canonical forms. The system performance depends heavily on the LCF. The translation quality of an MT system can be greatly improved if the LCF can be generated properly.
2.3.3 Transfer-Approach-Based System
Since most MT systems are experimental in nature, they
tend to use the transfer approach. The transfer approach gives the designer more flexibility in system design, and the systems are easy to maintain and update.
ERSO's MT system is a typical transfer system (Li &
Chang, 1988). The translation process of the ERSO system
contains the analysis, transfer, and synthesis stages. The analysis stage includes morphological analysis, syntactic
analysis, and semantic analysis. The morphological analysis is used to prevent the redundancy of the dictionary entries. It
divides the input sentences into tokens and transforms the inflected words into their base forms. The syntactic analysis generates parse trees from input English sentences through an
augmented transition network parser. The semantic analysis is used to disambiguate the syntactic ambiguity of the sentences.
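The morphological analysis step can be sketched as follows. This is not the ERSO analyzer, whose rules are not given here; the lexicon `BASE_FORMS`, the suffix table `SUFFIX_RULES`, and both function names are assumptions for illustration. The sketch shows why reducing inflected words to base forms keeps the dictionary small: only the base forms need entries.

```python
import re

# Toy lexicon of base forms; inflected forms are deliberately not stored.
BASE_FORMS = {"generate", "sentence", "system", "translate"}

# Toy suffix rules (suffix, replacement), tried in order (an assumption).
SUFFIX_RULES = [
    ("ies", "y"),
    ("ing", "e"),
    ("ing", ""),
    ("ed", "e"),
    ("ed", ""),
    ("es", ""),
    ("s", ""),
]


def tokenize(text):
    """Divide the input sentence into lowercase word tokens."""
    return re.findall(r"[A-Za-z]+", text.lower())


def base_form(token):
    """Reduce an inflected word to a base form found in the lexicon."""
    if token in BASE_FORMS:
        return token
    for suffix, replacement in SUFFIX_RULES:
        if token.endswith(suffix):
            candidate = token[: -len(suffix)] + replacement
            if candidate in BASE_FORMS:
                return candidate
    return token  # unknown words pass through unchanged


tokens = tokenize("The system generates translated sentences.")
print([base_form(t) for t in tokens])
# ['the', 'system', 'generate', 'translate', 'sentence']
```

A candidate is accepted only if it appears in the lexicon, so the suffix rules cannot produce a base form the dictionary does not know.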
The transfer stage performs lexical transfer and syntactic transfer. The lexical transfer translates English
words into Chinese words. The appropriate Chinese translation is selected by applying syntactic information, semantic information, and built-in dictionary word selection functions. The syntactic transfer transforms the English sentence parse tree into a Chinese sentence parse tree. In the final synthesis stage, syntactic synthesis further refines the transferred sentence to fit Chinese grammar. Morphological synthesis generates the Chinese phrase markers of tense, aspect, and voice that correspond to those in the English sentence and assembles them into a well-formed Chinese sentence.
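The three-stage organization described above can be illustrated with a minimal sketch. It is not the ERSO implementation: the two toy dictionaries, the Token class, and all function names are hypothetical, and only the morphological part of analysis and the lexical part of transfer are shown.

```python
from dataclasses import dataclass
from typing import List

# Hypothetical toy dictionaries; not taken from the ERSO system.
BASE_FORMS = {"bought": "buy", "books": "book"}
LEXICON = {"i": "wo", "buy": "mai", "book": "shu"}


@dataclass
class Token:
    surface: str  # word as it appears in the input
    base: str     # base form produced by morphological analysis


def analyze(sentence: str) -> List[Token]:
    """Analysis stage (morphology only): tokenize and reduce to base forms."""
    words = sentence.lower().rstrip(".").split()
    return [Token(w, BASE_FORMS.get(w, w)) for w in words]


def transfer(tokens: List[Token]) -> List[str]:
    """Transfer stage (lexical only): map each base form to a Chinese word."""
    return [LEXICON.get(t.base, t.base) for t in tokens]


def synthesize(words: List[str]) -> str:
    """Synthesis stage: assemble the target words into a sentence."""
    return " ".join(words).capitalize() + "."


def translate(sentence: str) -> str:
    """Run the three stages in sequence."""
    return synthesize(transfer(analyze(sentence)))


print(translate("I bought books."))
```

Because the base forms are stored once, the dictionary needs no separate entries for inflected words, which is the redundancy that morphological analysis is said to prevent.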
The ERSO system considers certain sentential level structure transfers, such as converting English "it is ... that ..." and "it is ADJ -ing ..." sentence structures into corresponding Chinese sentence structures. However, it treats them as special cases and does not deal with sentential structure transfer in a systematic way. Its syntactic transfer is concerned mostly with the phrasal level. It lists passive sentence translation as an unsolved problem and cannot determine when to use the Chinese passive to translate an English passive.
The transfer-approach-based English-to-Chinese MT systems have similar structures. Their differences lie in their emphasis on different modules.
The MT-H78 (Dong, 1984) emphasizes logical relationships
between constituents. It utilizes the case grammar concept (Fillmore, 1968) to establish the relationship between a verb
and other sentence elements. Its transfer module transforms the SL into a logical semantic structure. Chinese sentence generation is based on transformational rules over constituent relationships. Since the translation strictly follows the case-relation-based transformational rules, all deviation and nuance of the SL disappear in the TL after translation.
The ECMT-78 (Liu, 1982) is an MT system for translating
limited domain texts. All SL sentences are analyzed into intermediate representations. A sentence is classified into three layers. The predicate is the highest layer; the sentence components such as subject, object, and adverbial, etc. make
up the second layer; modifiers for sentence components are the third layer. The analysis establishes the layer relationship. The synthesis starts from the lowest layer to adjust phrasal
word order. Only the attributive preposition will change
position. The rest of the constituents will stay in their positions in Chinese as in English. The syntactic transfer does not go beyond phrasal level.
Other systems such as JFY-II (Z. Liu, 1984), PECMT-86 (Pan, 1987), and QHFY (Yu & Mao, 1988) vary in their translation domain and scale and differ in their emphasis. None of these systems has extended the syntactic transfer to the sentential level.
2.4 Conclusions from Previous Work
Most English-to-Chinese MT systems emphasize analysis (Huang, 1988; Su & Chang, 1990). Their generation schemes are comparatively simple. Their syntactic transfer is almost completely limited to the phrase level. Only Li and Chang's system (1988) has gone beyond phrase level transfer.
The linguistic differences between English and Chinese span from the sentential level to the morphological level. In fact, English is an SVO (Subject Verb Object) language, and Chinese is a mixed SVO and SOV (Subject Object Verb) language (Li and Thompson, 1981). Translation from English to Chinese must map the English SVO word order into the Chinese SVO or
SOV order. This means that translation from English to Chinese may need major sentential structure reorganization. When the English sentences can be expressed in Chinese SVO order, the generation of Chinese sentences could be limited to phrasal
and lexical level processing. However, many English sentences cannot be expressed in an appropriate SVO sentential structure in Chinese. Using the English sentential structure as the Chinese sentence structure in these cases makes the translation awkward, unnatural, or even unacceptable.
Translation in the direct approach does not use a separate generation process to produce the TL. It is limited to
word identification and substitution. The largest units a direct system can handle are usually phrases (Tan, 1988). It
is not easy to conduct sentential level reorganization of constituents in the direct systems. Thus, the direct approach is limited to a very small domain.
Most of the previous work for English-to-Chinese MT uses the transfer approach. In the transfer approach, the TL generation will undergo the transfer and synthesis process.
This makes the transfer system capable of handling major sentential structure transformation. However, due to the assumption that English and Chinese have the same sentential
structure, the best efforts most English-to-Chinese MT systems have made on structural transfer are confined to phrase level within the SVO framework. Since SVO word order only covers part of Chinese syntax, the shortcomings of these systems are obvious.
An ideal translation is based on thoroughly understanding the SL (Nirenburg, 1987). In this way, the TL generation can
be started from pure semantic representation, and the TL
generation can totally avoid the SL interference. However, total understanding of the SL cannot be realized at the current stage (Nagao, 1987), and the interlingual approach has many unsolved problems. The transfer approach is more realistic. The multilevel reconstruction via LCF approach presented in this dissertation is based mainly on the transfer strategy. The approach extends the syntactic transfer scope to the sentential level. Translation goes through sentential, phrasal, and lexical reconstruction. It uses mainly SL
syntactic information with some common semantic features. This approach overcomes the drawbacks of previous work.
CHAPTER 3
LINGUISTIC FEATURES OF CHINESE LANGUAGE
FROM THE VIEWPOINT OF ENGLISH THROUGH LCF
3.1 Introduction
English and Chinese belong to different language groups. English belongs to the Indo-European languages; Chinese belongs to the Sino-Tibetan languages. Their differences are enormous and spread through all levels of lexicon, phrase, and sentence. The quality of MT from English to Chinese is usually hindered by these wide linguistic contrasts, inadequate world knowledge, and lack of cultural information. Following from the discussion of LCF in the preceding chapter, the LCF serves as an inter-language for English-to-Chinese translation. The task of MT from English to Chinese can be partly considered as how to construct the LCF. Since the LCF is an important stage of MT, we look at the linguistic features of Chinese that are relevant to our English-to-Chinese MT project. In this chapter the linguistic features of Chinese are compared with those of English. The examples are arranged in the following manner:
English
LCF
Translation in Chinese
1 All Chinese words are expressed in Pinyin.
In section 3.2 the features of word and word class are
described and compared between English and Chinese. In section 3.3 the features of grammatical categories are discussed. In
section 3.4 the sentential features are discussed. The chapter is concluded in a summary in section 3.5.
3.2 The Word and Word Classes
Chinese words traditionally are divided into 'full words' (shici) and 'empty words' (xuci). Full words are words that have a concrete meaning and empty words are those that have an abstract meaning, and are for the most part employed to show
grammatical relationships. The distinction is similar to that drawn by some modern linguists between content and function words in English.
Full words fall into seven classes: nouns, verbs, adjectives, numerals, measures, pronouns and adverbs. A noun
is a word that can be modified by a demonstrative measure compound: shu 'book' can follow zhe-ben 'this (volume),' hence it is a noun. Unlike nouns in English, Chinese nouns do not inflect for number. For example, shu in Chinese means 'book.' When it appears in a sentence, it can either mean 'book' or 'books.'
(3.1) I bought a book/books.
I bought a book/books
Wo mai le shu.
A verb is a word that can immediately be negated by bu 'not' and can be followed by a set of typically verbal suffixes like -le 'suffix for perfect aspect,' and -zhe 'suffix for durative aspect.' The category 'verb' in Chinese is quite different from that in English. A superficial difference is that Chinese verbs do not conjugate for tense. At a deep level, however, the differences between Chinese and English are far more subtle. English verbs are sharply distinct from adjectives, both morphologically and
syntactically. That is, English verbs conjugate for tense, voice, and occasionally mood, while only some English
adjectives conjugate for comparison (-er and -est), with others not conjugating at all. At the syntactic level, an English verb forms the nucleus of the predicate while an adjective does not. In addition, a verb is required to occur
with an adjective to form a predicate. There are no such sharp distinctions between any two grammatical categories in Chinese.
(3.2) John is smart.
John smart
John congming
The Chinese equivalent of 'John is smart,' John congming, does
not have a verb. The predicate is formed by the adjective congming. Adjectives are generally considered to be a species of verb since they can be negated by bu and can function independently as predicates. One of the main differences
between adjectives and verbs is that most adjectives can be modified by the degree adverb hen 'very' but verbs cannot.
Numerals are bound morphemes which express quantity. Unlike English numerals, Chinese numerals cannot directly precede a noun. They must be followed by a measure word. 'Three books' in Chinese is san ben shu. Measures are bound morphemes that follow either numerals or demonstratives.
Numerals and demonstratives must be followed by an appropriate measure before they can modify a noun: yi ge ren 'one MEASURE man,' na liang che 'that MEASURE car.'
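The measure-word requirement can be sketched as a small generation rule. The dictionary below is a hypothetical three-entry lexicon built from the examples in the text, and the fallback to the general measure ge is an assumption made for illustration rather than part of the description above.

```python
# Hypothetical mini-lexicon: noun -> measure word, using the forms cited
# in the text (ben for books, ge for people, liang for vehicles).
MEASURES = {"shu": "ben", "ren": "ge", "che": "liang"}


def quantify(determiner: str, noun: str) -> str:
    """Place the noun's measure word between a numeral or demonstrative and the noun."""
    # Assumption: unknown nouns fall back to the general measure ge.
    measure = MEASURES.get(noun, "ge")
    return f"{determiner} {measure} {noun}"


print(quantify("san", "shu"))  # numeral + measure + noun
print(quantify("na", "che"))   # demonstrative + measure + noun
```

A translation system therefore has to store a measure word with each Chinese noun entry, since English supplies no corresponding word to translate.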
Pronouns are deictic and anaphoric words, that is, words
which point to persons or things in the speaking situation and words which corefer with preceding nouns. Syntactically they
generally behave like nouns, but unlike nouns they normally do not admit of modification. The third-person pronoun
distinguishes gender in the written language where separate graphs have artificially been devised for 'he,' 'she,' and 'it.' In the spoken language all three forms have the same pronunciation.
Adverbs modify verbs and adjectives and are usually
bound: zhi 'only,' jianjian 'gradually,' hen 'very,' zui 'most.' Adverbial modifiers derived from verbs are distinguished from adverbs as 'adverbial adjuncts,' and such
forms are frequently followed by the adverbial suffix -de: manmande 'slowly,' derived from man 'slow.'
The empty word classes are prepositions such as zai, chao, gei, etc., conjunctions such as gen, he, ye, keshi, etc., particles such as le, guo, zhe, etc., interjections such as wei, hei, etc., and onomatopoeia such as huala, gulong, dingdong, etc. Chinese prepositions originate from verbs. Literally, there is nothing in Chinese that can be exclusively called a preposition in the sense English prepositions are (Chu, 1983). The Chinese preposition can be
conveniently translated into either a preposition or verb depending on whether there is another verb in the same clause. Thus, some grammarians use the term "coverb" to designate this class of words that function somewhere between verbs and prepositions.
(3.3) John gave Mary a book.
John gave Mary a book
John gei le Mary yi ben shu.
(3.4) John wrote a letter to Mary.
John to Mary wrote a letter
John gei Mary xie le yi feng xin.
In (3.3) gei serves as a verb, but in (3.4) it serves as a preposition. Within this class, the words may exhibit
different degrees of verbal and prepositional properties. That is, some of them are more like prepositions and others are less so. There are certain syntactic criteria that have been
used by Chinese grammarians to determine the membership of verbs and prepositions. Chu (1983) gives some good criteria to
determine them: (a) while verbs typically co-occur with aspect markers, prepositions do not; (b) while verbs typically serve
as the center of predication, prepositions do not; and (c) while verbs typically may have their objects omitted, prepositions may not.
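The three criteria can be read as a simple checklist. The sketch below is illustrative only: the Usage record, the majority threshold, and the feature values assigned to gei are assumptions introduced here, not part of Chu's (1983) account.

```python
from dataclasses import dataclass


@dataclass
class Usage:
    """Observed behavior of a coverb in one clause (illustrative features)."""
    takes_aspect_marker: bool  # criterion (a)
    is_predicate_center: bool  # criterion (b)
    object_omissible: bool     # criterion (c)


def classify(usage: Usage) -> str:
    """Label a usage 'verb' if at least two criteria hold, else 'preposition'.

    The two-of-three threshold is an assumption made for this sketch.
    """
    score = sum([usage.takes_aspect_marker,
                 usage.is_predicate_center,
                 usage.object_omissible])
    return "verb" if score >= 2 else "preposition"


# gei in (3.3): followed by le and heading the predicate (values assumed).
gei_in_3_3 = Usage(True, True, True)
# gei in (3.4): no aspect marker; xie 'write' heads the predicate (values assumed).
gei_in_3_4 = Usage(False, False, False)

print(classify(gei_in_3_3), classify(gei_in_3_4))
```

Counting criteria rather than demanding all three reflects the observation that members of this class show different degrees of verbal and prepositional properties.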
Particles are those words that do not possess any tangible meanings in themselves but are required for grammatical (or discoursal) relations. They are called
particles just because there is not any well-established grammatical category in which they may fit. Particles are a group of generally monosyllabic, atonal forms used to show a
number of different grammatical relationships or subjective and modal overtones. There is no equivalent class in English to Chinese particles. Interjections are syntactic forms used to express warnings, call others' attention, or give verbal expression to some emotional state. Phonologically, they reveal certain irregularities in having elements which
are not a part of the normal phonemic inventory, like a voiced [h] and front central vowel [e]. Onomatopoeic words are words that imitate sounds of the natural world. Syntactically, they
may be nominal or adverbial adjuncts: huala 'the sound of rain falling,' gulong 'the sound of thunder or a large vehicle moving along.'
As in English, there is considerable class overlap in Chinese: li can be either a noun 'plow' or a verb 'to plow.'
A high proportion of modern disyllabic verbs can also serve as
nouns: piping 'criticize, criticism,' zuzhi 'organize, organization.'
3.3 Expression of Grammatical Categories
Chinese possesses very little of what is traditionally
known as inflectional morphology. Affixes therefore play only a minor role in the expression of grammatical relationships.
Word order, particles, and prepositions carry most of the burden of showing how the elements of a sentence relate to one another.
3.3.1 Number
Number is obligatorily expressed only for the pronouns. The same plural suffix found in the pronouns, -men, can also be optionally employed with nouns referring to human beings. However, the resulting forms differ from English plural nouns
in several ways. They are not used with numerals. They are not obligatory in any context, and they tend to refer to groups of people taken collectively, e.g., haizimen '(a certain group of) children,' laoshimen 'the teachers.' Although number is only rarely indicated morphologically, it is shown in other
ways when necessary. For example, demonstratives are shown to be plural by use of the plural measure -xie: naxie ren 'those people.' A plural subject or object can be indicated by the use of the adverb dou, usually translated 'all,' but often no more than a device for showing plurality:
(3.5) The books are all on the table.
The books all on the table
Shu dou zai zhuozi shang.
Number can also be expressed through the use of various quantifiers such as youde 'some,' yige 'one, a,' jige 'several,' or hen duo 'many.' Number is for the most part an optional category in Chinese, unlike in English, where it is obligatory.
3.3.2 Definite and Indefinite Reference
Similar to number is the category of definite and
indefinite reference. Chinese lacks articles, but there is little ambiguity. Definite elements may be overtly marked by
modifiers that themselves are inherently definite, such as the demonstratives and possessive pronouns. Nouns which lack such definite modifiers can still be shown to be definite by putting them at the beginning of the sentence, or at least before the verb. Compare the following three sentences:
(3.6) I didn't give him book(s).
I didn't give him book(s)
Wo mei gei ta shu.
(3.7) I didn't give him the book(s).
Book I did not give him
Shu wo mei gei ta.
(3.8) I didn't give him the book(s).
I did not BA2 book(s) give him
Wo mei ba shu gei ta.3
In (3.6) shu 'book' is indefinite; it follows the verb and is not marked. In (3.7) the definiteness of shu 'book' is expressed by its sentence-initial position. In (3.8) definiteness is expressed by the ba construction and the preverbal position.
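The three word orders in (3.6) through (3.8) can be produced by one ordering rule driven by the definiteness of the direct object. This is a minimal sketch; the function name and its parameters are hypothetical, and it covers only the double-object pattern of these examples.

```python
def order_sentence(subject: str, negation: str, verb: str,
                   indirect_obj: str, direct_obj: str,
                   definite: bool, use_ba: bool = False) -> str:
    """Order the constituents according to the definiteness of the direct object."""
    if not definite:
        # Indefinite object stays after the verb, as in (3.6).
        words = [subject, negation, verb, indirect_obj, direct_obj]
    elif use_ba:
        # Definite object moves before the verb with ba, as in (3.8).
        words = [subject, negation, "ba", direct_obj, verb, indirect_obj]
    else:
        # Definite object moves to sentence-initial position, as in (3.7).
        words = [direct_obj, subject, negation, verb, indirect_obj]
    return " ".join(words)


print(order_sentence("wo", "mei", "gei", "ta", "shu", definite=False))
print(order_sentence("wo", "mei", "gei", "ta", "shu", definite=True))
print(order_sentence("wo", "mei", "gei", "ta", "shu", definite=True, use_ba=True))
```

In an English-to-Chinese system the definite flag would come from the English article, so the article is translated as a change of position rather than as a word.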
3.3.3 Subordination and Modification
In Chinese, all modifiers precede the elements which they modify. This is a typical SOV language feature (Li & Thompson, 1981). A single suffix, -de, serves to indicate all cases of nominal subordination, including that of possession (Chu, 1987).
(3.9) my television set
my television set
wo de dianshiji
(3.10) this morning's meeting
this morning's meeting
jintian shangwu de hui
(3.11) the development of light industry
light industry of development
qinggongye de fazhan
2 The uppercase BA is a Chinese word.
3 BA is a transitivity-enhancing marker for a direct object (Chu, personal communication). It is generally called a "disposal marker" (Wang, 1947) or a "pretransitive particle" (Chao, 1968).
This usage of de is similar to the genitive case of English. But unlike the genitive in English, de is also used to mark modifying clauses:
(3.12) the money that you gave to them
you gave to them money
ni gei tamen de qian
(3.13) people who like to smoke
like to smoke person
xihuan chouyan de ren
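Since every modifier precedes its head and is linked to it by de, examples (3.9) through (3.13) reduce to a single pattern. The sketch below illustrates that pattern only; the function names are hypothetical, and the English-side analysis that identifies the head and the modifier is assumed to have been done already.

```python
def modify(modifier: str, head: str) -> str:
    """Build a Chinese nominal phrase: modifier + de + head."""
    return f"{modifier} de {head}"


def translate_of_phrase(head: str, dependent: str) -> str:
    """English 'HEAD of DEPENDENT' puts the head first; Chinese reverses the order."""
    return modify(dependent, head)


print(modify("wo", "dianshiji"))                  # possessive, as in (3.9)
print(translate_of_phrase("fazhan", "qinggongye"))  # of-phrase, as in (3.11)
print(modify("xihuan chouyan", "ren"))            # modifying clause, as in (3.13)
```

The same function serves possessives, of-phrases, and relative clauses, which is why relative clauses that follow the head in English must be moved in front of it during reconstruction.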
3.3.4 Case Relationships
If case is taken in the abstract sense to refer to those
grammatical devices used to show the relationships between nouns and verbs in a sentence, then case relationships in Chinese, as in English, are mostly expressed by means of prepositions. Prepositions along with their objects always occur together with another verbal phrase, which they generally precede.
Certain case relations, however, are expressed by word
order only. Both the agentive subject of a transitive verb and the subject of an intransitive verb are unmarked, and normally they precede the verb. In ta zou le 'he left,' the verb zou is intransitive and the subject ta precedes it; in ta chi le fan 'he ate a meal,' the verb chi is transitive and its agentive subject ta precedes it. The indefinite direct object
follows the verb, and is likewise unmarked. Definite objects may also follow the verb, especially if they are of certain inherently definite types.
With certain verbs denoting existence, appearance, and disappearance, the logical subject of a verb may come after the verb: xia yu 'rains,' the verb is xia and the logical subject is yu; zoule yige ren 'a person left,' zou is the verb; chu taiyang le 'the sun has come out,' chu is the verb.
The dative (indirect object or beneficiary of an action) can be shown either by its position after the verb or by the preposition gei 'to, for':
(3.14) I gave him a book.
I gave him a book
Wo songle ta yi ben shu.
(3.15) I handed the letter to him.
I BA the letter handed to him
Wo ba xin jiao gei ta le.
(3.16) John brought food for Mary.
John for Mary brought food
John gei Mary dai lai le shiwu.
The preposition for the instrumental is either yong or na: yong dao qie rou 'cut meat with a knife,' na chi liang 'measure with a ruler.' The comitative relationship is
expressed by gen: gen meimei qu 'go with (one's) younger sister.'
The above prepositions all express what is generally known as grammatical functions (Lyons, 1968). Another set of prepositions is associated with what is known as local functions. Zai is the locative preposition. It is used to indicate where an action takes place: zai keting shuijiao 'sleep in the living room.' This and other local prepositions are frequently associated with nouns followed by a simple or complex localizer. Simple localizers are bound morphemes suffixed to nouns to indicate certain spatial relationships. In the spoken language only -li 'inside' and -shang 'above, on top' occur with any degree of versatility: wuli 'in the room, indoors,' zhuozishang 'on the table.' Complex localizers are formed by suffixing one of several elements to a bound localizer. These suffixes are -tou, -bian, and -mian: bai zai shang tou 'put on top,' ta zai wai bian 'he is outside.'
The ablative relationship is shown by the preposition cong.
(3.17) Mr. Zhang came from Shanghai.
Zhang Mr. from Shanghai came
Zhang xiansheng cong shanghai lai.
Destination is expressed by the preposition dao: dao Beijing qu 'go to Beijing.' Both cong and dao often occur with phrases containing localizers.
3.3.5 Aspect and Voice
Aspect is a term that describes a certain part of an action or event. It is a way of viewing a situation. Tense in language relates the time of the occurrence of the situation
to the time that situation is brought up in speech. Aspect, on the other hand, refers to how the situation itself is being
viewed with respect to its own internal makeup (Li & Thompson, 1981).
Chinese is an aspect language and not a tense one (Chao,
1968; Chu, 1983; Li & Thompson, 1981). This means that Chinese is concerned with telling whether actions are completed or not, or whether they are actually in progress or not. The plotting of action along some sort of time axis, so important in English, is not a feature of Chinese.
Completed action or perfective aspect is shown by the verbal suffix -le:
(3.18) I wrote a letter.
I wrote a letter
Wo xiele yi feng xin.
(3.19) Go after you have eaten.
after you have eaten go
Ni chile fan zai qu.
Examples (3.18) and (3.19) show that the verbal suffix -le may refer to the future as well as to the past. This demonstrates that it marks aspect and not tense. Uncompleted action or
imperfect aspect except durative aspect is unmarked, that is, there is no suffix or other overt marking associated with it:
(3.20) I read a book yesterday evening.
yesterday evening I read a book
Zuotian wanshang wo kan shu.
In (3.20) the verb is imperfective with zero marking. However, this does not necessarily mean that the action is not
completed; rather, it indicates that completion is not at issue in this particular sentence. The speaker is merely describing what he did last evening, without reference to whether he completed it or not. The perfective can be described as the marked member of the aspect opposition, in that it specifically indicates whether the action was carried
through to completion or not. The imperfect simply leaves the question open.
An action can be shown to be durative in several ways. The most common way today is to place zai before the verb:
(3.21) They are eating.
They are eating
Tamen zai chi fan.
Another less common durative form has the suffix -zhe 4 after the verb:
(3.22) They are just now holding a meeting.
They just now are holding a meeting
Tamen xianzai zheng kaizhe hui.
4 This durative is less independent and has a very different usage from zai.
More commonly the verbal suffix -zhe is used to form stative verbs from action verbs: chuan yifu 'puts on clothing,' chuanzhe yifu 'is wearing clothing.' It is treated as a durative aspect marker in semantics and as a subordinating marker in syntax (Chu, 1987).
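The aspect markers discussed in this section can be summarized as a small marking function. It is a sketch only: the four aspect labels and the function name are introduced here for illustration, and the choice among them is assumed to have been made by an earlier analysis step.

```python
def mark_aspect(verb: str, aspect: str) -> str:
    """Attach the aspect marking described in this section to a verb."""
    if aspect == "perfective":
        return verb + "le"     # suffix -le, as in xiele
    if aspect == "durative":
        return "zai " + verb   # zai placed before the verb
    if aspect == "stative":
        return verb + "zhe"    # suffix -zhe, as in chuanzhe
    if aspect == "imperfective":
        return verb            # zero marking
    raise ValueError(f"unknown aspect: {aspect}")


for verb, aspect in [("xie", "perfective"), ("chi", "durative"),
                     ("chuan", "stative"), ("kan", "imperfective")]:
    print(mark_aspect(verb, aspect))
```

Note that the function takes no tense argument: the time reference of the English verb has to be carried by adverbials such as zuotian wanshang, as in (3.20).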
Chinese verbs in themselves lack any distinction of active and passive. Chi can mean either 'eat' or 'be eaten.' The passive sense of a verb can be made explicit by supplying an agent, expressed by means of one of several prepositions. The most commonly used passive is the bei-sentence. There are other words that can be used in place of bei in the same function. The two following examples can be considered typical passive sentences:
(3.23) He was criticized by everyone.
He BEI everyone criticized
Ta bei dajia pipingle.
(3.24) Xiaoling was beaten by father.
Xiaoling RANG father beaten
Xiaoling rang baba dale.
In Chinese, passives are for the most part restricted either to verbs of an unfavorable meaning or to verbs denoting disposal or separation. The English passive sentence does not imply this meaning as often.
3.3.6 Negation
From a syntactic point of view, negatives behave like adverbs in that they precede and modify verbs. Bu 'not' can be used with any verb except the existential or possessive verb you, which is invariably negated with mei: buqu 'doesn't go, won't go, wouldn't go,' buhao 'not good.' Mei is the existential negative: mei(you) shu 'there is no book, doesn't have a book.' It is also the negation of the perfective and
durative aspects: mei lai 'didn't come, hasn't come yet,' mei(you) zuozhe 'not (actually) sitting.' Negative commands are formed with buyao (literally 'not want') or bie: ni bie qu 'you don't go.'
The semantic effect of the general rule that the negative particle follows the subject and precedes the verb phrase is
that the verb phrase is in the scope of the negative. In other words, the verb phrase, the part of the sentence which follows the negative particle, is what is being denied by the negative particle.
When the sentence contains an adverb, whether the negative precedes the adverb or the adverb precedes the
negative depends entirely on scope. If the adverb has the negative in its scope, then it precedes the negative. If the
negative has the adverb in its scope, then it precedes the adverb.
(3.25) He often doesn't come.
He often doesn't come
Ta chang bu lai.
In this sentence, the adverb precedes the negative. It is not in the negation scope. The negation covers only the verb.
(3.26) He doesn't come often.
He doesn't often come
Ta bu chang lai.
In sentence (3.26), the adverb follows the negative particle bu. The negation scope is on the adverb chang, not on the verb lai.
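The scope rule in (3.25) and (3.26) amounts to a single ordering decision. The sketch below assumes that scope has already been determined from the English sentence; the function and parameter names are hypothetical.

```python
def negate(subject: str, adverb: str, verb: str, adverb_in_scope: bool) -> str:
    """Order bu and the adverb so that whatever follows bu is what is negated."""
    if adverb_in_scope:
        # Negation covers the adverb, as in (3.26).
        words = [subject, "bu", adverb, verb]
    else:
        # Adverb lies outside the negation, which covers only the verb, as in (3.25).
        words = [subject, adverb, "bu", verb]
    return " ".join(words)


print(negate("ta", "chang", "lai", adverb_in_scope=False))
print(negate("ta", "chang", "lai", adverb_in_scope=True))
```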
3.3.7 Modality
The various modalities which in English are expressed by modal auxiliaries are also as a rule expressed by modal auxiliary verbs in Chinese. Most of the Chinese modal auxiliary verbs have more than one function, and there is a certain amount of semantic overlap among them, especially in the case of those verbs expressing possibility, permission, and potentiality.
Volition is most commonly expressed by the verb yao: yao qu 'want to go.' Negative volition may be buyao but more usually it is buxiang or buyuanyi. Both xiang and yuanyi are also volition auxiliaries: xiang lai 'want to come,' yuanyi kanshu 'want to read, feels like reading.'
Obligation may be expressed in several ways: dei 'must, have to' expresses strong obligation. Its negation is either buyong or bubi. A weaker degree of obligation may be shown
with yinggai or yingdang 'ought, should.' Both of these auxiliaries can be negated in the ordinary way with bu.
The auxiliary hui is used for the expression of possibility.
(3.27) She may come today.
She today may come
Ta jintian hui lai.
Hui as an auxiliary also has the common meaning of 'to know how to, to possess the requisite knowledge to': hui kaiche 'knows how to drive a car.' Keyi is the most common verb denoting permission:
(3.28) You cannot swim here.
You cannot here swim
Ni bu keyi zai zher youyong.
The expression of potentiality is rather complicated. The most general auxiliary used for this notion is neng: neng zoulu 'can walk,' buneng shuohua 'cannot speak.' With two large classes of complex verbs, another device for indicating potentiality is more common. The two classes of verbs are verb-directional complement and verb-resultative complement compounds. The first of these constructions consists of a verb plus a complement indicating the direction of the action: na shanglai 'bring up (here),' zou jinqu 'walk in (there).' Verb-resultative complement constructions are composed of a verb plus a complement expressing the result of the action: da si 'beat to death,' chi bao 'eat one's fill.' In both types of
constructions potentiality is generally expressed by placing an infixed de between the two parts of the construction: na de shanglai 'can bring up (here)' and chi de bao 'can eat one's fill.' In the corresponding negative form, de is replaced by bu: na bu shanglai 'cannot bring up (here)' and chi bu bao 'cannot eat one's fill.'
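The infixing rule for verb-complement constructions can be stated compactly. This is a minimal sketch with a hypothetical function name; it assumes the verb and its complement have already been chosen during lexical transfer.

```python
def potential(verb: str, complement: str, negative: bool = False) -> str:
    """Insert de (positive) or bu (negative) between a verb and its complement."""
    infix = "bu" if negative else "de"
    return f"{verb} {infix} {complement}"


print(potential("na", "shanglai"))
print(potential("chi", "bao", negative=True))
```

An English modal such as "can" or "cannot" before such a verb is thus realized as an infix inside the Chinese verb compound rather than as a separate auxiliary.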
3.4 The Sentence
3.4.1 Subject and Topic
A sentence in English generally consists of a subject and a predicate. The subject is a noun or its equivalent. The predicate has a verb as its nuclear element. When the verb is transitive, it takes a noun as its direct object. The normal
order of the subject, the verb, and the direct object in English is S(ubject) V(erb) O(bject).
The sentence in Chinese generally also consists of a subject and a predicate. The predicate has a verb as its nuclear element. For a transitive verb, there is a direct object following it. The basic order for these three elements in the Chinese sentence is SVO. However, according to Greenberg's typological scheme (Greenberg, 1963), Chinese is
inconsistent with respect to the features that correlate with VO or OV order (Li & Thompson, 1981). Chinese also possesses many SOV language features: for example, SOV sentences do exist, and the modifier precedes the head noun. Some SOV sentences are commonly used. Thus, Chinese can be seen to have some of the
features of an SOV language and some of those of an SVO language, with more of the former than of the latter. Hence, Chinese is regarded as a mixed SVO and SOV language.
At the most elementary level, Chinese sentences can be divided into major and minor types. Major sentences contain both a subject and a predicate: wo qu 'I will go.' A minor sentence contains only a predicate: qu '(I'll) go.' The frequent omission of pronominal subjects means that minor sentences are more common in Chinese than in English. Major
sentences can be subdivided into simple subject-predicate sentences and composite sentences. A composite sentence is formed of two or more simple sentences (either major or minor) in close combination. If the components are in a coordinate relationship, it is a compound sentence:
(3.29) I'm going and Zhangsan's going too.
I'm going and Zhangsan's too going
Wo qu, Zhangsan ye qu.
A complex sentence results when the component parts are in any of several noncoordinate relationships:
(3.30) If you go, then I won't go.
If you go, then I won't go
Yaoshi ni qu, wo jiu bu qu.
One of the most striking features of Chinese sentence structure, and one that sets Chinese apart from many other languages, is that in addition to the grammatical relation of
"subject" and "direct object," the description of Chinese must
also include the element topic. In English the basic notion on which a sentence is built is the subject, whereas in Chinese
it is the topic. Because of the importance of topic in the grammar of Chinese, the language is regarded as a topic-prominent language (Chao, 1968; Chu, 1983; Li & Thompson, 1981). Basically, the topic of a sentence is what the sentence is about. It always comes first in the sentence, and it always refers to something about which the speaker assumes the person listening to the utterance has some knowledge. What
distinguishes topic from subject is that the subject must always have a direct semantic relationship with the verb as the one that performs the action or exists in the state named by the verb, but the topic need not.
(3.31) Two teachers are not enough.
Two teachers are not enough
Liang ge laoshi bu gou.
Liang ge laoshi 'two teachers' is the topic of the
sentence. It does not have a direct semantic relation with the verb gou.
The topic-prominent sentence structure is a significant
typological feature of Chinese in terms of which it can be compared to English. Nearly all English sentences must have a
subject, and the subject is easy to identify in an English sentence, since it typically occurs right before the verb and the verb agrees with it in number:
(3.32) a. That guy has money.
b. Those guys have money.
In Chinese, on the other hand, the concept of subject seems to be less significant, while the concept of topic appears to be quite crucial in explaining the structure of ordinary sentences in the language. The subject is not marked by position, by agreement, or by any case marker; in fact, in
ordinary conversation, the subject may be missing altogether, as in examples (3.33) and (3.34):
(3.33) Yesterday, I read for two hours.
Yesterday I read for two hours
Zuotian nianle liang ge zhongtou de shu.
(3.34) It's very cold.
very cold
Hao leng a.
Both the one who did the reading in (3.33) and what it is that is cold in (3.34) are inferred from the context, but do not need to be expressed syntactically by subjects, as they do in English.
3.4.2 Predicate
Looked at from the point of view of the predicate, Chinese sentences may be divided into many types, some of which closely parallel analogous English sentences in their basic structure.
Copular (or nominal) sentences may contain no verb at
all:
(3.35) Today is Friday.
Today Friday
Jintian xingqiwu.
(3.36) My wife is a Cantonese.
My wife Cantonese
Wo airen guangdong ren.
The more usual form of this type of sentence contains the copular verb shi 'is/are.' This verb, unlike its English counterpart, has only a copular function. Although it can be negated directly by bu like other verbs, it takes none of the aspect markers. As in the case of subject and predicate, the relationship between shi and its complement is varied. The most common relationship is that of equality or class membership, as in (3.37) and (3.38), respectively:
(3.37) Zhangsan is his father.
Zhangsan is his father
Zhangsan shi ta de fuqin.
(3.38) All these students are Chinese.
These student all are Chinese
Zhexie xuesheng dou shi zhongguoren.
Sometimes a copular predicate merely explains or comments on the topic of the sentence in a loose manner. For example, if a person is asked, "Why are you at home during the daytime?" he might reply: wo shi wanshang de ke 'my class is in the evening.' Sentences of this type are very common, especially in the colloquial language.
Existential sentences contain you, which is always negated with mei. Unlike shi, it may take some of the aspectual suffixes. An existential sentence may occur with or without a place adjunct: you fan ma? 'Is there any rice?'; zher meiyou ren 'There is no one here.'
In Chinese, possession is expressed by means of an existential predicate: wo you shu 'I have a book.' John Lyons (1968) has pointed out that sentences (3.39) and (3.40) bear the same transformational relationship to one another as do (3.41) and (3.42):
(3.39) The book is on the table.
The book on the table
Shu zai zhuozi shang.
(3.40) There are books on the table.
On the table there are books
Zhuozi shang you shu.
(3.41) The book is mine.
The book is mine
Shu shi wode.
(3.42) I have books.
I have books
Wo you shu.
In (3.39) and (3.41) 'book' is definite; in (3.40) and (3.42) it is indefinite. Structurally, the only real difference between (3.40) and (3.42) is that the word zhuozi is inanimate and wo is animate. It is this and not the difference between
two instances of the verb you, one meaning "have" and the other meaning "be," that determines the choice of "there is"
or "have" in the English translation. If we use an unnaturally literal translation like "there is a book by me (or at me)," the parallelism of (3.40) and (3.42) is perfectly clear.
Chinese sentences are divided into declarative,
interrogative, and imperative types. Chao (1968) observes that declarative sentences have truth value, that is, they can be
judged to be true or false. Questions, on the other hand, have information value. They are requests for information. Commands have what Chao calls compliance value. The request, order or
plea contained in such a sentence can be complied with or rejected.
Declarative sentences can be said to be unmarked.
Questions, on the other hand, are marked, and fall into several distinct categories. Questions asking for specific information contain one of a set of question words. The most common question words are duoshao 'how much, how many,' zenme
'how,' weishenme 'why,' shui 'who, whom,' nar 'where,' and shenme shihou 'when.' Questions containing question words have the same word order as the corresponding answer:
(3.43) What do you want?
you want what
Ni yao shenme?
(3.44) I want that book.
I want that book
Wo yao na ben shu.
Questions which require a "yes" or "no" answer are of two types. The simpler type is formed by the use of sentence particles, of which ma is the most common:
(3.45) Are you going?
you going
Ni qu ma?
The other type is formed by juxtaposing two choices or alternatives for the listener to choose between:
(3.46) Are you going to drink water or tea?
you drink water or drink tea
Ni he shui he cha?
An especially common subtype of the choice question is formed by offering a choice between a verb and its negative:
(3.47) Are you going?
you go or not go
Ni qu bu qu?
An important restriction on this question form is that it cannot be employed if the verb is modified by an adverb.
Imperatives are generally expressed by the verb alone: qu 'go!' The presence of a second-person pronoun is somewhat more usual in Chinese than in English: ni qu '(you) go.' Imperatives may be made less blunt by the use of the advisative particle ba: nimen lai ba 'you come/why don't you come?' Imperatives are negated with buyao or bie. The latter form is generally considered to be a contraction of buyao. In
Chinese it is difficult to separate the imperative (which refers to the second person) from wishes or commands concerning the first and third person. All of them employ the particle ba and are negated with buyao or bie.
(3.48) Let's go.
let's go
Zanmen zou ba.
(3.49) Let's not go.
let's not go
Zanmen bie zou.
(3.50) Let Zhangsan go.
let Zhangsan go
Zhangsan qu ba.
While ba is usual in sentences like (3.48) and (3.50), which refer to the first and the third person, it may be absent if the injunction is considered urgent:
(3.51) Let's go at once!
let's at once go
Zanmen kuai zou!
3.5 Summary
Chinese words traditionally are divided into two categories: full words and empty words. The full words include nouns, verbs, adjectives, numerals, measures, pronouns, and adverbs. They have concrete meaning and can be used alone as sentence parts. The empty words include prepositions,
conjunctions, particles, interjections, and onomatopoeia. They have more abstract meanings than full words and are mostly used to show grammatical relationships.
Chinese possesses very little of what is known as inflectional morphology. Affixes therefore play only a minor
role in the expression of grammatical relationships. Word order, particles, and prepositions carry most of the burden of showing how the elements of a sentence relate to one another.
The loose semantic relationship between predicate and what is sometimes regarded as subject in Chinese sentences makes it more appropriate to consider Chinese a topic-prominent language. The topic-prominent sentence structure is a significant typological feature of Chinese in terms of
which it can be usefully compared to English. It enables Chinese sentences to have no subjects. From the point of view of the predicate, Chinese sentences may be divided into many types, some of which closely parallel analogous English
sentences in their basic structure, while others are totally different. A translation program has to take into
consideration both the similarities and the differences in order to map a structure from English to Chinese.
CHAPTER 4
MULTILEVEL RECONSTRUCTION TRANSLATION MODEL
4.1 Introduction
A language is a system of structures that are the bearers of meaning. The structures of language are linguistic units of varying types that are related hierarchically. Words, phrases, and sentences are units of different types and hierarchical
levels (Wirth, 1985). Each linguistic unit has a semantic value conventionally assigned to it that is identifiable independently of context of use. Table 4.1 shows the relationship of the hierarchy.
Table 4.1
Major Units of Linguistic Structure and Their Typical Semantic Correlates
Type of Linguistic Forms    Semantic Values
Sentence/Clause             Propositions
Phrases                     Predicates, arguments (entities, concepts)
Word                        Parts of arguments, parts of predicates, quantifiers
Translation from a source language (SL) to a target language (TL)
changes the language forms, namely words, phrases, clauses, and sentences. If we place the SL at one end of an
axis and the TL at the other end, a gap lies between them. The task of translation is to bridge the gap and preserve the SL text meaning as much as possible in the TL text.
For translation from English to Chinese, the gap in translating a single sentence can be viewed as comprising differences at three levels:
(1) Sentential level differences
(2) Phrasal level differences
(3) Lexical level differences
The sentential level differences are reflected in the word order of the main constituents of a sentence.
The phrasal level differences are mainly reflected in the composition of a phrase and its order.
The lexical level differences include the word form and word-specific requirements when a word is used in a sentence.
From the word order viewpoint, English is an SVO
language. Chinese has a mixed order of SVO and SOV. This means that translation from English to Chinese has to map the English SVO word order to either the SVO or the SOV order in Chinese.
On the other hand, English and Chinese have many quite
different features (Chao, 1968; Chu, 1983; Li & Thompson, 1981). They use different linguistic devices to construct sentences. When performing translation from English to
Chinese, these features and devices have to be considered. For
example, Chinese does not have articles to mark a noun as definite or indefinite. Definiteness can be expressed through word order. Another example is the use of the passive voice. The passive voice is frequently used in English. By and large, it does not have any additional semantic import
in comparison with its active counterpart. But in Chinese, the passive form is almost exclusively used for expressing adverse situations. Thus, when translating an English passive voice sentence, the translation system needs to check whether the
sentence expresses an unfortunate situation. If it does, then a Chinese passive can be used. If it does not, the use of a Chinese passive may distort the original meaning. These kinds of differences exist in a quite broad range. They need
particular attention in a translation system design, if a high-quality translation is to be achieved. The traditional one-step transfer-based translation can hardly handle these wide differences with satisfactory results (Tsutsumi, 1990).
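The passive-voice check described above can be sketched as a small decision function. This is only an illustration under stated assumptions: the function name and the fixed list of adverse verbs are hypothetical, and a real system would rely on semantic analysis of the whole sentence rather than on a word list.

```python
# Hypothetical lexicon of verbs treated as expressing an adverse situation.
# A fixed list is an assumption made only to keep the sketch self-contained.
ADVERSE_VERBS = {"rob", "hit", "steal", "destroy", "cheat"}


def choose_chinese_voice(english_voice: str, verb_lemma: str) -> str:
    """Return 'passive' or 'active' for the Chinese sentence structure."""
    if english_voice != "passive":
        # An English active sentence is not turned into a Chinese passive here.
        return "active"
    # A Chinese passive is kept only when the situation is adverse.
    return "passive" if verb_lemma in ADVERSE_VERBS else "active"
```

With these assumptions, an English passive built on "rob" keeps the passive in Chinese, while one built on a neutral verb such as "publish" is restructured as an active sentence.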
Based upon the language hierarchical features and the differences between English and Chinese as well as the weakness of early English-to-Chinese MT systems, we develop a multilevel reconstruction translation model. This model
extends the consideration for translating an English sentence up to sentential level. It generates Chinese sentences following the hierarchical order of the linguistic structure from sentential level to phrasal level and to lexical level
(Chen & Tou, 1991). It first reconstructs the Chinese sentence
structure on the basis of the syntactic and semantic information of the English sentence. After the sentence
structure is determined, the phrases are reconstructed into Chinese expressions. The result is the LCF for English-to-Chinese translation. The final stage is the word reconstruction. This completes the Chinese translation of an English sentence.
The multilevel reconstruction translation model is designed to achieve high translation quality for English-to-Chinese translation. The main idea is to bridge the
linguistic gap between English and Chinese through sentential, phrasal, and lexical reconstruction. In this hierarchical processing, the system can concentrate on one level at a time
and obtain an optimized result at each level of the language hierarchy. Thus, the translation of the whole sentence is optimized in the sense of the final result.
In this chapter, the multilevel reconstruction
translation model is described in detail. Section 4.2 gives an outline of the model. Section 4.3 describes the sentential level reconstruction and how the Chinese sentence structure can be determined. It is the most basic and most important part of the model, since reconstruction in all the other levels is based on it. Section 4.4 describes how Chinese phrases are reconstructed from their English counterparts. Section 4.5 describes the lowest level in the hierarchy--the
word level. Finally, in section 4.6, the chapter is summarized and the model is discussed.
4.2 The Overall Structure of the Model
The structure of the multilevel reconstruction model is shown in Fig. 4.1. The components of the structure are described briefly below, and their details are explained in the following sections.
4.2.1 Sentential Reconstruction Module
After an English sentence is analyzed, the sentential reconstruction module will establish the sentence structure and word order for the output Chinese sentence. A sentence structure is the foundation of a sentence. English and Chinese may have different ways to express an entity or idea. The sentential reconstruction establishes the most pertinent sentence structures for Chinese translation. The
reconstruction is based on English sentence syntactic and semantic information, including sentence pattern, voice, verb, and the characteristics of postverb elements.
Fig. 4.1 The Multilevel Reconstruction Translation Model Structure
(English Sentence --> Analysis --> Sentential Level Reconstruction --> Phrasal Level Reconstruction --> Lexical Level Reconstruction --> Generated Chinese Sentence)
4.2.2 Phrasal Reconstruction Module
The phrasal reconstruction module takes sentence constituents as its working elements. It resolves the phrasal level differences between English and Chinese. These differences are reflected in two aspects: multiple-word phrase construction and the ordering of multiple adverbial phrases. For multiple-word phrases, since both English and Chinese phrase construction follow certain rules, their differences can be resolved through phrasal transformational rules, which transform an English phrase into a Chinese phrasal expression. To order multiple adverbial phrases, phrases with different functions can be ordered according to their functional roles. Phrases with the same function can be ordered through a larger-concept-first principle. After the phrasal reconstruction, the sentence becomes LCF.
4.2.3 Lexical Reconstruction Module
The lexical module substitutes Chinese words for English words. Chinese words are chosen from a bilingual dictionary. The main principles for choosing a word are
(1) domain restriction;
(2) semantic marker attached to a word in the semantic
analysis stage; and
(3) context information and pragmatic information
recognized in the English sentence analysis stage.
The lexical module also handles tense and aspect problems and other word-specific problems. These include adding a measure word to a noun, a result complement to a verb, a direction complement to a preposition, etc.
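The three modules outlined in this section can be pictured as stages applied in a fixed order to the analyzed English sentence. The following Python sketch shows only that control flow. All function names and the dictionary representation are hypothetical; each stage merely records that it ran, where a real module would rewrite the sentence.

```python
def _mark(sentence: dict, level: str) -> dict:
    # Record that a level has been processed, leaving the input unchanged.
    done = list(sentence.get("levels_done", []))
    done.append(level)
    return {**sentence, "levels_done": done}


def sentential_reconstruction(sentence: dict) -> dict:
    # Placeholder: would establish the Chinese sentence structure and word order.
    return _mark(sentence, "sentential")


def phrasal_reconstruction(sentence: dict) -> dict:
    # Placeholder: would transform and order phrases, yielding the LCF.
    return _mark(sentence, "phrasal")


def lexical_reconstruction(sentence: dict) -> dict:
    # Placeholder: would substitute Chinese words and handle tense and aspect.
    return _mark(sentence, "lexical")


# The stages run from the highest level of the hierarchy to the lowest.
PIPELINE = [sentential_reconstruction, phrasal_reconstruction, lexical_reconstruction]


def translate(analyzed_sentence: dict) -> dict:
    result = analyzed_sentence
    for stage in PIPELINE:
        result = stage(result)
    return result
```

Keeping each level in its own function mirrors the idea that the system concentrates on one level at a time.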
4.3 Sentential Level Reconstruction
The sentence is the largest segment of a language
about which specific descriptive statements of a linguistic construction can be made. It is independent in the sense that
it does not stand in grammatical construction with other segments (although it may logically or psychologically be more or less tied to the preceding or following sentences). It is complete in the sense that it forms a structured grammatical whole and a semantically complete entity. Besides, sentences may be described in terms of their type, form, and structure.
The sentence structure lays the foundation for the translation for the whole Chinese sentence. At the sentential reconstruction level, the structure of a Chinese sentence is
reconstructed on the basis of the sentence pattern, verb, postverbal elements, voice, etc., of its English counterpart.
The sentential level reconstruction is divided into four major steps: (1) syntactic-rule-based reconstruction, (2) sentence-pattern-based reconstruction, (3) structural feature-based
reconstruction, and (4) complex sentence order reconstruction.
4.3.1 Syntactic-Rule-Based Reconstruction
There are six fundamental sentence structures in English (Stockwell, 1977). All other English sentences can be obtained from combinations of these six structures. The six sentence structures are
(1) NP V (ADV)
John arrived (on Friday).
(2) NP V NP (ADV)
John bought the book (on his way home).
(3) NP V NP NP (ADV)
John threw Mary the ball (angrily).
(4) NP BE NP
John is a lawyer.
(5) NP BE ADJ
John is intelligent.
(6) NP BE PP
The book is on the desk.
These six structures can be transformed into six structures in Chinese accordingly without considering any semantic domain influence.
(1) NP V (ADV) --> NP (ADV) V
John arrived (on Friday).
John (on Friday) arrived
John (xingqiwu) lai le.
(2) NP V NP (ADV) --> NP (ADV) V NP
John bought the book (on his way home).
John (on his way home) bought the book
John (zai hui jia de lushang) mai le na ben shu.
(3) NP V NP NP (ADV) --> NP (ADV) V NP NP
John threw Mary the ball (angrily).
John (angrily) threw Mary the ball
John (shengqide) reng gei Mary na ge qiu.
(4) NP BE NP --> NP BE NP
John is a lawyer.
John is a lawyer
John shi yi ge lushi.
(5) NP BE ADJ --> NP ADJ
John is intelligent.
John intelligent
John congming.
(6) NP BE PP --> NP PP
The book is on the desk.
book on desk top
Shu zai zhuozi shang.
These syntactic transformational rules can be used effectively for reconstructing a Chinese sentence structure only when no other factors affect it. Since the determination of a Chinese sentence structure depends on more factors than the syntactic one alone, the other influential factors have to be considered first in order to achieve high quality and naturalness. The syntactic transformational rules should be the last resort in the translation process.
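The six transformations listed above amount to two reordering operations: a trailing adverbial moves in front of the verb, and BE is dropped before an adjective or a prepositional phrase but kept before a noun phrase. A minimal Python sketch follows, assuming a sentence is given as a flat list of (category, text) pairs. That representation and the function name are illustrative assumptions, not the data structures of the system.

```python
from typing import List, Tuple

Constituent = Tuple[str, str]  # (category label, text)


def reconstruct(constituents: List[Constituent]) -> List[Constituent]:
    """Apply the six syntactic transformational rules to one simple sentence."""
    labels = [label for label, _ in constituents]
    result = list(constituents)
    # Rules (1)-(3): move a trailing ADV to the position just before V.
    if "V" in labels and labels[-1] == "ADV":
        adverbial = result.pop()
        result.insert(labels.index("V"), adverbial)
    # Rules (5)-(6): drop BE before ADJ or PP. Rule (4) keeps BE before NP.
    if labels[:2] == ["NP", "BE"] and labels[2:] in (["ADJ"], ["PP"]):
        result = [c for c in result if c[0] != "BE"]
    return result
```

For example, the constituents of "John arrived on Friday" come out in the order NP, ADV, V, and those of "John is intelligent" come out as NP, ADJ.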
4.3.2 Pattern-Based Reconstruction
In translating from English to Chinese, sentence patterns play an important role. Sentences, according to their structure, can be divided into different patterns. Many English sentence patterns traditionally have fixed
translations. They must be translated into their corresponding Chinese sentence patterns. For these kinds of English
sentences, the best way to translate them is to follow the conventional patterns. On the other hand, the meaning of some English sentences is not clearly reflected in their surface form. It would be difficult for syntax-based translation to convey the correct meaning of these sentences to the TL. One solution is to identify these sentences as patterns
and store this sentence pattern information in a database. Thus, whenever an English sentence is recognized as conforming to a pattern, the corresponding Chinese pattern structure can be retrieved from the database and used to construct the Chinese sentence. For example,
(4.1) This is too good to be true
is a common English sentence pattern. It can be identified as the <... too ADJ/ADV to ...> pattern. Here, the sentential meaning is not simply the total of its constituents. If one does not know the pattern feature, one simply cannot get the correct meaning of this sentence. Using the <... tai ADJ/ADV bu keneng ...> Chinese sentence pattern structure, the corresponding Chinese sentence translation turns out to be:
(4.2) Zhe tai hao le bu keneng shi zheng de.
this too good LE not capable be true DE
instead of a direct translation, which would be wrong:
(4.3) zhe shi tai hao shi zheng de.
this be too good be true DE
For another example,
(4.4) The disk drive ran too long a time to be working
properly.
In this sentence, the main verb is an action verb instead of a copula, but the sentence is still considered to be of the <... too ADJ/ADV to ...> pattern due to the presence of the <... too ADV to ...> segment. Thus, using the <... tai ADJ/ADV bu keneng ...> pattern, the Chinese sentence becomes
(4.5) Cipan qudongqi yunzhuan shijian tai chang le bu
disk drive run time too long LE not
keneng gongzuo zhengchang.
capable work normal
A second pattern example may be illustrated with the following sentence:
(4.6) He did not come back until midnight.
It is considered to be of the <... not ... until ...> pattern. Its corresponding Chinese pattern is <... dao ... cai ...>. If this sentence is not treated as a special pattern, the adverbial phrase "until midnight" would appear at the beginning of the sentence or between the subject and the verb in Chinese according to Chinese grammar. The Chinese translation would become
(4.7) ta zhi dao banye hai mei hui lai.
he till reach half night yet not back come
This Chinese sentence does not correctly convey the
meaning of the English sentence. It means that up to midnight, he had not come back yet; whether he came back shortly after midnight is unknown. This is obviously not the original meaning. Using the pattern rule, the translation becomes
(4.8) ta zhi dao ban ye cai hui lai.
he till reach half night just back come
This carries the correct interpretation of the English sentence.
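The pattern database described in this section can be sketched as a table that pairs a recognizer for an English pattern with the corresponding Chinese pattern. The sketch below is an illustration only: it matches on the surface string with regular expressions, which is a simplifying assumption, since the model recognizes patterns on the analyzed sentence. The names PATTERN_DB and lookup_pattern are hypothetical.

```python
import re
from typing import Optional

# Each entry: (English pattern name, recognizer, Chinese pattern structure).
# Only the two patterns discussed above are included.
PATTERN_DB = [
    ("too ADJ/ADV to", re.compile(r"\btoo\b.+\bto\b", re.IGNORECASE),
     "<... tai ADJ/ADV bu keneng ...>"),
    ("not until", re.compile(r"\bnot\b.+\buntil\b", re.IGNORECASE),
     "<... dao ... cai ...>"),
]


def lookup_pattern(english_sentence: str) -> Optional[str]:
    """Return the Chinese pattern for a recognized English pattern, else None."""
    for _name, recognizer, chinese_pattern in PATTERN_DB:
        if recognizer.search(english_sentence):
            return chinese_pattern
    # No pattern applies; other reconstruction steps would take over.
    return None
```

A sentence that matches no entry returns None, which corresponds to falling through to the other reconstruction steps.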
4.3.2.1 Cleft sentence pattern
For the English cleft sentence, Chinese has a quite similar sentence pattern. This pattern is very close to the English cleft sentence in both meaning and form. For instance,
(4.9) Becker beat Lendl in the Wimbledon final.
Sentence (4.9) above can be divided into three distinct parts
and each can be emphasized separately using cleft sentence pattern.
(4.10) a. It was Becker that beat Lendl in the
Wimbledon final.
b. It was Lendl that Becker beat in the
Wimbledon final.
c. It was in the Wimbledon final that Becker
beat Lendl.
The normal unmarked sentence (4.9) can be translated into Chinese as
(4.11) Becker zai Wimbledon juesai dabai le Lendl.
Becker in Wimbledon final-match beat lose LE Lendl
Using the pattern, the cleft-sentence translations are, respectively,
(4.12) a. Shi Becker zai Wimbledon juesai dabai le
be Becker in Wimbledon final-match beat lose LE
Lendl (de).
Lendl DE
b. Shi Lendl Becker zai Wimbledon juesai dabai
be Lendl Becker in Wimbledon final-match beat
le (de).
lose LE DE
c. Shi zai Wimbledon juesai Becker dabai le
be in Wimbledon final-match Becker beat lose LE
Lendl (de).
Lendl DE
The versions (4.12a), (4.12b), and (4.12c) are the appropriate translations for sentences (4.10a), (4.10b), and (4.10c). The emphasized part immediately follows the shi. The uncleft sentence structure is basically kept. If the pattern is not used, the translation of (4.10) would be like
(4.13) a. *ta shi Becker ta zai Wimbledon juesai dabai le Lendl.
he be Becker he in Wimbledon final-match beat LE Lendl
or, using a rule to delete the it was,
b. ?Becker ta zai Wimbledon juesai dabai le Lendl.
Becker he in Wimbledon final-match beat LE Lendl
They are either unacceptable or unnatural.
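The cleft translations in (4.12) follow one regularity: shi comes first, the emphasized constituent follows it immediately, the remaining constituents keep their uncleft order, and an optional de closes the sentence. A minimal Python sketch of that regularity is given below, assuming the uncleft Chinese sentence is already available as a list of constituents. The function name and the list representation are assumptions made for illustration.

```python
from typing import List


def cleft(uncleft_constituents: List[str], focus_index: int) -> str:
    """Build a Chinese cleft sentence that emphasizes one constituent."""
    focus = uncleft_constituents[focus_index]
    # The other constituents keep the order they have in the uncleft sentence.
    rest = uncleft_constituents[:focus_index] + uncleft_constituents[focus_index + 1:]
    return " ".join(["Shi", focus] + rest) + " (de)."


# Constituents of the uncleft sentence (4.11).
BASE = ["Becker", "zai Wimbledon juesai", "dabai le", "Lendl"]
```

Calling cleft(BASE, 0), cleft(BASE, 3), and cleft(BASE, 1) reproduces the word order of (4.12a), (4.12b), and (4.12c), respectively.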
4.3.2.2 Existential pattern
The English existential sentence is marked by an initial dummy there as its subject. It is a grammatically distinct construction for expressing existential propositions. Existence in Chinese is expressed by the existential you, the same form as the possessive verb.
The general pattern for translating English existential there be is
For example,
(4.14) There are five airplanes in the sky
is an existential sentence. If translated directly without using the pattern rule, it would become
(4.15) *nar shi wu jia feiji zai tianshang.
there be five M airplanes in sky
Using the pattern rule, it becomes
(4.16) Tianshang you wu jia feiji.
sky has five M airplane
which is the correct translation. In this case, there is a locative complement and a noun phrase in the sentence. The locative occupies the subject (topic) position, and the noun phrase occupies the object position. Sentence (4.14) can be considered a prototypical case. There is a considerable variety of clauses containing the pronominal there as the subject. They can also use the pattern for reconstruction.
1 If the nominal is considered as a topic, the pattern can be . We choose the form here.
One such variation is the English sentence structure
there be + NP + predicative.
The predicative is normally an adjective, such as
present, absent, available, eligible, etc., or a participle clause complement.
(4.17) There is no doctor present.
In this case, there will be no locative complement to be moved into the subject (topic) position in the Chinese sentence. The subject slot will be left empty.
place empty;
entity NP + predicative;
and the translation of sentence (4.17) is
(4.18) Mei you yisheng zai chang.
not YOU doctor in place
When the predicative part is a participle clause complement, the rule will be the same.
place empty;
entity NP + participle clause;
For example,
(4.19) a. There were children playing on the road.
b. There were children singing.
Their corresponding pattern translations are, respectively,
(4.20) a. you haizi zai lu shang wan2
YOU child on road SHANG play
b. You haizi zai changge.
YOU child ZAI sing song
From the above examples we can see that the English there be sentence is always translatable into the Chinese pattern, in spite of its variations in form.
When an intransitive verb other than be is used in the there sentence, it's no longer simply an existential one. The verb has to be translated. The verbs usually are appear, arrive, arise, come, exist, follow, occur, etc. For example,
(4.21) There arrived many students.
As the verb arrive has its specific meaning beyond existence in the sentence, translation must specify this meaning in addition to the existence expressed by you. Thus, the Chinese pattern becomes
< YOU + NP + intransitive verb + .>
(4.22) a. There lives a young man on the second floor.
b. you yi ge nianqing ren zhu zai er lou.
YOU one M young person live on two floor
c. There appears a big alligator.
d. you yi tiao da eyu chuxian le.
YOU one M big alligator appear LE
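The existential examples in this section share one constituent order: an optional place expression in the subject (topic) slot, then you (preceded by mei when negated), then the noun phrase, then any predicative, participle clause, or intransitive verb. The following Python sketch assembles that order from pieces that are assumed to be translated already. The function and parameter names are hypothetical.

```python
from typing import Optional


def existential(np: str, place: Optional[str] = None,
                predicate: Optional[str] = None, negated: bool = False) -> str:
    """Order the parts of a Chinese existential sentence built on you."""
    parts = []
    if place:
        # A locative complement moves into the subject (topic) position.
        parts.append(place)
    if negated:
        parts.append("mei")
    parts.append("you")
    parts.append(np)
    if predicate:
        # Predicative, participle clause, or intransitive verb follows the NP.
        parts.append(predicate)
    return " ".join(parts)
```

When no locative is present, the place slot is simply left empty, as in (4.18) and (4.22).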
2 Here, "on the road" is assumed to modify "playing"; otherwise the locative phrase should be in the sentence-initial position.
4.3.2.3 Extraposition
Extraposition in English describes the syntactic process which characteristically moves a subordinate clause in the subject position to the right (i.e., to a position beyond the main predicate) and inserts a dummy subject in the sentence-initial position. The sentence (4.23b) is derived from the one in (4.23a).
(4.23) a. (For you) to change your mind now would be a
mistake.
b. It would be a mistake (for you) to change your
mind now.
In Chinese there is no sentence pattern corresponding to the English extraposition sentence. If an extraposed English sentence is translated without using the pattern rule, the result would be an ungrammatical Chinese sentence. For sentence (4.23b) the direct Chinese translation is
(4.24) *ta shi yi ge cuowu (dui ni) xianzai gaibian
he be one M mistake to you now change
nide zhuyi.
your mind
To process the English extraposed sentence, the Chinese translation has to use the nonextraposed form. Thus for both (4.23a) and (4.23b), the Chinese sentence translation would be
(4.25) (Dui ni) xianzai gaibian nide zhuyi shi cuowu de.
to you now change your mind be mistake DE
Here, due to the lack of a particular linguistic device, the Chinese translation loses a stylistic variation.
4.3.3 Structural Feature-Based Reconstruction
For some English sentences, the linguistic devices they use do not have direct equivalents in Chinese. To realize the same linguistic function, the Chinese sentence structure may be different from that of the English sentences. In this case, the corresponding Chinese sentence structure may be determined by many factors. These factors include the sentence verb, postverbal elements, definiteness, voice, etc.
4.3.3.1 Verb and postverbal elements
The verb of the predicate is of fundamental importance in a Chinese sentence; everything else in the sentence ultimately depends on it (Henne, Rongen, & Hansen, 1977). In many sentences the subject can be omitted, but the predicate is required. Except for cases where a contrastive is expressed on a nominal, the predicate is a nonomissible part of the sentence and the center of the sentence. In the same way, in a verbal predicate the verb is the center; the verb forms a
nonomissible part of it. Thus, once the main verb of a Chinese sentence is chosen, the sentence structure can in many cases be determined from the verb together with other features.
Word order is a case in point. While English is an SVO language, Chinese has a mixed order of SVO and SOV. This means that translation from English to Chinese has to map the English SVO word order to either the SVO or the SOV order in
Chinese. The mapping depends largely on the verb and the features of postverbal elements.
Generally speaking, when a verb is chosen, the sentence
structure could be determined. However, for certain verbs, the postverb elements also affect the sentence structure. For
example, when a transitive verb is chosen, the position of the syntactic object (i.e., pre-verb or post-verb) is determined by whether the object noun is a topic at the discourse level,
whether it is definite in the speaker's assumption, and/or whether it is an agent at the semantic/functional level (Chu, 1983).
Chinese ba-construction is a good example. It is a
commonly used sentence construction in Chinese. Omitting it in the translation may make the translation unnatural. The Chinese ba-construction is a unique language form, also known
as "disposal sentence" (Wang, 1947). It means that an agent does something that in some way affects a patient. Its basic form is
Subject BA object verb complement
The BA in the sentence is treated as a coverb or
preposition, and the complement can be in the form of a resultative or directional verb ending, a cognate object, a phrase, or simply a le. The following examples illustrate how the ba-construction is used.
(4.26) John put the book on the table.
If the sentence is processed through the normal translation procedure, the Chinese sentence would be
(4.27) John fang le zhe ben shu zai zhuozi shang.
John put LE the M book on table SHANG
It is grammatically correct, but a bit awkward. The more natural way of saying this in Chinese should be
(4.28) John ba shu fang zai zhuozi shang le.
John BA book put on table SHANG LE
where the ba-construction is used.
Syntactically speaking, there are three requirements for using the ba-construction (Chu, 1983):
(1) the preposed patient noun must be either definite
or specific,
(2) the verb must be an action verb, and
(3) there must be something after the verb as a
"complement" to indicate the effect of the action
on the patient.
When any English sentence meets these requirements, the ba-construction should be used and the translation will be more natural3. For example,
(4.29) Bill loaded the hay on the truck.
The verb load is an action verb. The patient noun phrase the hay is definite, and the hay is affected in its position. Thus, the ba-construction should be used to translate this sentence:
(4.30) Bill ba gancao zhuang zai kache shang le.
Bill BA hay load on truck top LE
If the ba-construction is not used, the translation would be
(4.31) Bill zhuang le gancao zai kache shang.
Bill load LE hay on truck top
It does not sound as natural as (4.30). For another sentence,
3 The consideration is in sentence level. Discourse level factors are not considered in current stage.
(4.32) Don't take my typewriter away.
This is an imperative sentence. The verb and the object meet the three requirements. Using the ba-construction, the translation becomes
(4.33) Bie ba wode daziji na zou.
not BA my typewriter take away
It carries the English meaning quite well and is quite natural in Chinese.
The following example illustrates a different situation:
(4.34) John put a box on the table.
The object box in sentence (4.34) is marked by an indefinite article and is indefinite in meaning. It does not fulfill one of the three requirements for the ba-construction. Thus, the
ba-construction should not be used, and its translation should be
(4.35) John fang le yi ge hezi zai zhuozi shang.
John put LE one M box on table SHANG
4.3.3.2 Definiteness
English and Chinese may use different devices to express
certain notions. For instance, English uses an article to express the definiteness of a noun. The definite article the
marks a noun as definite, and the indefinite article a(n) marks a noun as indefinite. In Chinese, there is nothing like
articles for such functions. The definiteness of a noun is expressed either through overtly marked modifiers that themselves are inherently definite or by the syntactic position of the noun relative to the verb. Roughly, preverbal
nouns are considered definite and postverbal nouns are
considered nondefinite if they are not otherwise marked. This device therefore greatly affects the word order of a sentence. For example,
(4.36) a. John bought a book.
b. John bought the book.
Sentences (4.36a) and (4.36b) are structurally the same except that in sentence (4.36a) the word book is modified by an indefinite article a, while in (4.36b) the same word is modified by a definite article the. It reflects the speaker's assumption about the hearer's perception of the identity of the entity "book." For sentence (4.36a), the notion "indefiniteness" can be expressed in one way only:
(4.37) John mai le yi ben shu.
John buy LE one M book
For (4.36b), there are several ways of expressing definiteness in Chinese:
(4.38) a. John mai le na ben shu.
John buy LE that M book
b. John ba shu mai le.
John BA book buy LE
c. Shu John mai le.
book John buy LE
Translation (4.38a) uses a demonstrative word na to
express the notion of definiteness. For the same notion, translation (4.38b) uses the ba-construction, and translation
(4.38c) uses the preverbal position. The problem of which device is more appropriate depends more on discourse context
and pragmatics, which are beyond the scope of consideration of this dissertation.
In our model, the ba-construction is tried first. If the sentence cannot meet the ba-construction requirements, definiteness is expressed overtly by translating the definite article the into Chinese zhe or na. The determination of the sentential structure is subject to other considerations.
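The decision order just described can be illustrated with a minimal Python sketch. It is not the implementation used in this work. The class SimpleClause, its fields, and the function names are hypothetical, the three requirement flags are assumed to be supplied by earlier analysis, and the output keeps English words in Chinese order. Rendering an indefinite patient with "one" (mirroring yi in (4.35)) and a definite patient with "that" (mirroring na) are illustrative choices.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class SimpleClause:
    """Hypothetical container for one analyzed English clause."""
    agent: str
    verb: str
    patient: str            # bare head noun, e.g. "hay"
    complement: str         # post-verbal complement, "" if absent
    patient_definite: bool  # requirement (1)
    verb_is_action: bool    # requirement (2)


def meets_ba_requirements(c: SimpleClause) -> bool:
    # Requirement (3) is approximated by "a complement is present".
    return c.patient_definite and c.verb_is_action and bool(c.complement)


def reconstruct_clause(c: SimpleClause) -> List[str]:
    # Try the ba-construction first.
    if meets_ba_requirements(c):
        return [c.agent, "BA", c.patient, c.verb, c.complement]
    # Otherwise mark definiteness overtly on the post-verbal patient.
    determiner = "that" if c.patient_definite else "one"
    tokens = [c.agent, c.verb, determiner + " " + c.patient]
    if c.complement:
        tokens.append(c.complement)
    return tokens
```

For example, a clause built from (4.29) yields the order Bill BA hay load on-truck, while a clause built from (4.34) keeps the patient after the verb.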
4.3.3.3 The passive sentence
The normal English passive voice sentence takes the following form:
Patient be verb-en (by agent)
It is a device to put the patient in subject position, usually because there is no agent mentioned. From the semantic point
of view, it is the same as its active counterpart. The passive voice, however, is frequently used in English, especially in
writings of science and technology. The translation of the English passive voice to Chinese poses a special problem (Li & Chang, 1988), since the passive in Chinese is used with a far lower frequency and it also possesses some special meaning.
In Chinese, the passive form is typically expressed by the bei-construction. Its form is
Patient BEI (Agent) V complement.
The BEI is treated as a coverb or preposition and the complement is in the form of a resultative or directional verb
ending, a cognate object, a phrase, or simply a le. The agent, as in English, is optional.
Currently almost all MT systems use the Chinese bei-construction as the equivalent to the English passive voice, and its coverb bei as the equivalent to the English
preposition by. As a matter of fact, the bei-construction does not mean the same as its English counterpart, the be-passive.
Linguists have used the term "pejorative" to explain the meaning of the Chinese bei construction. That is, the bei construction means something unfortunate (Chao, 1968; Chu, 1983; Li & Thompson, 1981). By using the bei construction, the speaker actually means that the event is unfortunate--to the patient or to the speaker. Thus, when translating an English
passive sentence without any unfortunate implication, the bei construction should not be used. Instead, it should be transformed into an active sentence with the patient as the subject/topic. For example,
(4.39) This book was translated into Chinese by Dr. Li.
Its passive translation would be
(4.40) Zhe ben shu bei Li boshi fanyi cheng zhongwen le.
this M book BEI Li doctor translate into Chinese LE
Superficially, (4.40) is a perfectly acceptable translation for (4.39). However, the Chinese sentence has the added meaning that it is unfortunate that the event happened the way it did or that it happened at all.
Turning to the fact that the patient occurs in the
sentence initial position, this is a device to mark the
patient as the topic of the sentence. The English passive serves this function without necessarily indicating the
unfortunate nature of the happening. It is therefore more appropriate to translate English passive into Chinese with a structure where the patient is made the topic. Whether there should be the coverb bei to express the unfortunate
implication depends on the interpretation of the original English sentence. For some verbs, the unfortunate interpretation is an extra-linguistic feature. For example,
(4.41) a. The Bible has been translated into Chinese by
Dr. Li.
b. Shengjing Li boshi fanyi cheng zhongwen le.
bible Li doctor translate into Chinese LE
The verb translate does not necessarily imply an unfortunate
happening and thus normally should not be translated with the bei construction. Other verbs may have an inherent unfortunate reading, for example,
(4.42) a. The whole city was burned down by the enemy.
b. Zheng ge cheng dou bei diren shao guang le.
whole M city total BEI enemy burn none-left LE
In sentence (4.42a) the verb phrase burn down almost always
carries an unfortunate meaning, and thus the bei construction is used. The adverse semantics can be considered to be primarily coded in verbs or verb phrases.
To determine whether to use Chinese bei passive for translating the English passive sentence, the adverse semantics has to be considered. If the English sentence
implies an unfortunate meaning, then the bei construction can
be used. If there is no unfortunate meaning involved in the
English sentence, the English passive structure should be translated as an active sentence.
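The choice between the bei construction and the topic-initial active structure can be sketched as follows. This is a minimal illustration, assuming that adverse semantics is looked up in a verb lexicon. The set ADVERSE_VERBS and the function name are hypothetical, and the two entries are taken only from examples in this chapter. The output keeps English words in Chinese order.

```python
from typing import List, Optional

# Hypothetical lexicon of verbs coded with an unfortunate reading.
ADVERSE_VERBS = {"burn down", "hit"}


def reconstruct_passive(patient: str, verb: str,
                        agent: Optional[str] = None,
                        complement: str = "") -> List[str]:
    """Order the constituents of an English passive for Chinese."""
    agent_part = [agent] if agent else []
    if verb in ADVERSE_VERBS:
        # Unfortunate reading: Patient BEI (Agent) V complement.
        tokens = [patient, "BEI"] + agent_part + [verb]
    else:
        # Neutral reading: the patient stays as topic, without BEI.
        tokens = [patient] + agent_part + [verb]
    if complement:
        tokens.append(complement)
    return tokens
```

With this sketch, (4.41a) comes out without BEI, as in (4.41b), and (4.42a) comes out with BEI, as in (4.42b).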
4.3.3.4 Comparison
The comparison structure in Chinese is quite different from that in English. In English, the unit being compared can be any part of a sentence. In Chinese, what is being compared must be the topics of the sentence (Chu, 1983). In the following, there is a comparison structure in Chinese
corresponding to the one in (4.43a), but none corresponding to the one in (4.43b).
(4.43) a. He speaks Spanish better than I do.
b. He speaks Spanish better than he does French.
The comparative structure involves two items compared along one dimension. There are three types of relationships these two items can have to each other. One can be (1) more
than the other (superiority); (2) less than the other (inferiority); or (3) the same as the other (equality). The basic pattern for all three types in Chinese is
A comparison-word B (adverb) dimension.
The comparison structures for the three types, however, differ:
(a) A bi B ADV/ADJ                 (compare with)
(b) A mei(you) B (neme) ADJ/ADV    (not as ... as)
(c) A gen B yiyang ADJ/ADV         (as ... as)
For example,
(4.44) a. He is taller than his father.
b. This box is smaller than that one.
These two sentences are of the superiority relationship type, and thus will be translatable into type (a):
(4.45) a. Ta bi tade fuqin gao.
he comparison his father tall
b. Zhe ge hezi bi na ge xiao.
this M box comparison that M small
For an inferiority comparison sentence,
(4.46) He is not as tall as his father.
type (b) comparison structure will be used:
(4.47) Ta mei(you) tade fuqin (neme) gao.
he not YOU his father that tall
For the equality comparison,
(4.48) He is the same height as his father is.
type (c) will be used:
(4.49) Ta gen tade fuqin yiyang gao.
he GEN his father same tall
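The three comparison patterns can be written as a small selection function. The sketch below is illustrative only. The function name and the relation labels are hypothetical, and the optional elements of pattern (b) are kept in parentheses as they appear in the pattern.

```python
from typing import List


def build_comparison(a: str, b: str, dimension: str, relation: str) -> List[str]:
    """Arrange 'A comparison-word B (adverb) dimension' for one relation type."""
    if relation == "superiority":      # pattern (a)
        return [a, "bi", b, dimension]
    if relation == "inferiority":      # pattern (b)
        return [a, "mei(you)", b, "(neme)", dimension]
    if relation == "equality":         # pattern (c)
        return [a, "gen", b, "yiyang", dimension]
    raise ValueError("unknown relation: " + relation)
```

Joining the tokens for ta, tade fuqin, and gao under the three relation labels reproduces (4.45a), (4.47), and (4.49).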
These three comparison structures can only be used for topic or subject comparison. However, English comparative sentences can compare direct objects:
(4.50) He speaks Spanish better than he does French.
where the compared items are the direct objects Spanish and French. For this type of sentence, the objects must be transformed into topics in Chinese before they can be compared. So the Chinese equivalent of (4.50) is
(4.51) Ta shuo de xibanyayu bi (ta shuo de) fayu hao.
he speak DE Spanish comparison he speak DE French
good
Literally, (4.51) means "The Spanish he speaks is better than the French (he speaks)."
4.3.4 Complex Sentence Reconstruction
In English, complex sentences are linked by conjunctions. The order of the main clause and the subordinate clause (except in the case of a relative clause) is usually not restricted. The main clause may either precede or follow the subordinate clause. The main-subordinate relationship is
marked not by the order of the clauses, but by the conjunction word. In Chinese, the complex sentence order is restricted by
two semantic principles (Lu, 1990; Tai, 1985): (1) the temporal principle, and (2) the logical principle. Any grammatical main-subordinate complex sentence has to follow the two semantic
principles, if it is not otherwise marked. In other words, the clauses of a Chinese complex sentence cannot be placed at random locations.
The temporal principle:
Chinese clauses follow the temporal sequence. The clause which describes the event happening first should precede the clause which describes the event following.
For example, English can have the following two sentences which describe two identical chronological events in different syntactic orders.
(4.54) a. After I had my dinner, I went out for a walk.
b. I went out for a walk after I had my dinner.
Semantically these two sentences are not any different. The main-subordinate arrangement is a stylistic variation or pragmatic consideration. However, in Chinese, the main-subordinate order has to follow the temporal principle. The
event that happens first must precede the event which happens later. Thus the description of the event in (4.54) can have
only one syntactic ordering in a complex sentence in Chinese:
(4.55) wo chi wan fan hou chu qu sanbu le.
I eat dinner later go out walk LE
The main-subordinate clause expression follows the natural temporal sequence of the events. The "eating" action happened before the "going out for a walk."
If the Chinese translation of (4.54b) kept the same main-subordinate clause order,
(4.56) *Wo chu qu sanbu le chi wan fan hou.
I go out walk LE eat dinner later
it would be ungrammatical, as it violates the temporal principle.
The logical principle:
Chinese clause order should follow the order of logical
sequence. That is, the clause which describes the premise, cause, reason, condition, etc., should precede the clause which describes the effect, result, action.
Logical relations include such logical connections among events as cause, reason, goal, purpose, condition, etc. In English, such clauses may have flexible order as long as a morphological marker is present. But in Chinese, the logical
relation between the members of a main-subordinate clause pair strictly follows logical order: they have to be cause-effect,
problem-solution, reason-result, condition-action, etc. For example, the following English sentences are all acceptable:
(4.57) a. I will leave if that's the case.
b. If that's the case I will leave.
c. We couldn't start the car as the gas pump was
broken.
d. As the gas pump was broken we couldn't start
the car.
But in Chinese, only the orders of the (b) and (d) versions are considered appropriate. The translation for sentences (4.57a) and (4.57b) is
(4.58) Ruguo shi na yang de hua wo jiu de zou le.
if be the way DE talk I must go LE
The translation for sentences (4.57c) and (4.57d) will be
(4.59) yinwei youbang huai le women bu neng fadong
because gas-pump broken LE we not able start
chezi.
car
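Both principles reduce to the same ordering operation once each clause is labeled with its role. The sketch below is a minimal illustration. The class ClauseUnit, the role labels, and the function name are hypothetical, and the labels are assumed to come from earlier analysis of the conjunction.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ClauseUnit:
    """Hypothetical clause record: text plus its temporal or logical role."""
    text: str
    role: str  # "earlier"/"later" (temporal) or "premise"/"result" (logical)


# Roles that must come first under the temporal and logical principles.
FIRST_ROLES = {"earlier", "premise"}


def order_clauses(first: ClauseUnit, second: ClauseUnit) -> List[ClauseUnit]:
    # sorted() is stable, so a pair that is already in order is left unchanged.
    return sorted([first, second], key=lambda c: c.role not in FIRST_ROLES)
```

Applied to (4.54b) and (4.57a), the sketch moves the after-clause and the if-clause to the front. Applied to (4.54a) and (4.57b), it leaves the order as it is.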
4.4 Phrasal Level Reconstruction
The phrasal level reconstruction takes place after the sentential level reconstruction has established the Chinese sentence structure. Its task is to resolve the phrasal level
differences between English and Chinese, and the result is the LCF for the translation.
4.4.1 Adverbial Position
During the sentence structure reconstruction, the Chinese sentence structure is established and most sentence
constituents will be placed in their position according to Chinese grammatical requirements. However, optional adverbials are not handled in the sentential level reconstruction
process. They are still in the position which fits English grammar instead of Chinese grammar. They need to be moved to the correct position according to Chinese grammar.
The typical adverbial position in English is at the end
of a sentence. However, an adverbial can also occur at the sentence-initial position, and some adverbials, such as manner
adverbials, can take a preverbal position that is not sentence-initial. In Chinese, the typical position for adverbials is at the beginning of a sentence or between the subject and the verb. Polysyllabic adverbs may occur in either position with
little meaning difference, while monosyllabic adverbs can occur only between the subject and the verb (Chu, 1983). There must be a mapping process to transfer the three different English adverbial positions to two Chinese adverbial positions. The rules for the transfer are
(1) If the adverbials are at the end of a sentence, they
will be transferred to the position between subject
and verb.
(2) If the adverbials are at the beginning of a sentence,
the sentential initial position will be kept in the
Chinese sentence.
(3) For the noninitial preverbal adverbials, their
preverbal position will be kept in the Chinese
sentence.
For the sentence (4.29) above, if it has a time adverbial phrase "at three o'clock yesterday afternoon," the sentence will be
(4.60) Bill loaded the hay on the truck at three o'clock
yesterday afternoon.
After the sentential reconstruction, the Chinese ba-construction is established. The sentence will become
(4.61) Bill BA the hay loaded on the truck at three
o'clock yesterday afternoon.
The agent, patient, and locative constituents are all arranged into their appropriate positions except the time phrase "at three o'clock yesterday afternoon." It is still at the end of the sentence. This position is unacceptable in Chinese.
According to the adverbial transfer rule, it will be moved to the position between the subject and the verb. In a
ba-construction, it should appear between the subject and the particle ba. Putting the adverbial phrase into its proper position, the sentence becomes
(4.62) Bill at three o'clock yesterday afternoon BA the
hay loaded on the truck.
More examples are listed below:
(4.63) a. They worked very hard.
b. Tamen feichang nuli gongzuo.
they very hard work
c. Fortunately, he was not hit by the car.
d. Xinghao, ta mei bei chezi zhuang dao.
Fortunately he not BEI car hit on
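The three transfer rules above can be sketched as a single placement function. This is an illustration under simplifying assumptions. The function name, the position labels, and the representation of a sentence as a subject plus a list of remaining constituents are all hypothetical. Because the particle BA, when present, is the first element after the subject, placing the adverbial directly after the subject also covers the ba-construction case.

```python
from typing import List


def place_adverbial(subject: str, rest: List[str],
                    adverbial: str, english_position: str) -> List[str]:
    """Map an English adverbial position onto a Chinese one.

    english_position is "final", "initial", or "preverbal".
    """
    if english_position == "initial":
        # Rule (2): the sentence-initial position is kept.
        return [adverbial, subject] + rest
    if english_position in ("final", "preverbal"):
        # Rules (1) and (3): the adverbial sits between subject and verb
        # (or between subject and BA in a ba-construction).
        return [subject, adverbial] + rest
    raise ValueError("unknown position: " + english_position)
```

Calling the function on the constituents of (4.61) with the position "final" gives the order shown in (4.62).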
4.4.2 Multiple Word Phrase Reconstruction
A multiple-word-phrase can have different word order between English and Chinese. In the phrasal level
reconstruction, the phrases which have different word order between English and Chinese will be reconstructed into Chinese word order.
4.4.2.1 Noun phrase
A noun phrase in English can have modifiers either preceding or following the head noun. In Chinese, all
modifiers must precede the head noun. To translate English noun phrases, one task is to transfer the English trailing modifiers to a preceding position. For example, the following English noun phrase
(4.64) a personal computer with a color monitor
has two modifiers of the head noun computer. The adjective personal precedes the head noun computer. The prepositional phrase with a color monitor follows the head noun
computer. The English grammar rule for generating this noun phrase is
NP --> DET ADJ N PP.
The reconstruction rule for obtaining a corresponding Chinese noun phrase is
DET ADJ N PP --> DET PP ADJ N.
It moves the prepositional phrase from its position after the head noun to a position before the adjective and the head noun. The expression in Chinese is
(4.65) a. a with a color monitor personal computer
b. yi tai dai caise xianshiqi de geren jisuanji
a M with color monitor DE personal computer
For general multiple-word noun phrases, their English-to-Chinese reconstruction rules are summarized in the following:[4]
English:                            Chinese:
DET N                          -->  DET N
DET (ADJ)*[5] N                -->  DET (ADJ)* N
DET (ADJ)* N PP                -->  DET PP (ADJ)* N
DET ADV (ADJ)* N               -->  DET ADV (ADJ)* N
DET ADV (ADJ)* N PP            -->  DET PP ADV (ADJ)* N
NUM N                          -->  NUM N
DET NUM (ADJ)* N               -->  DET NUM (ADJ)* N
DET NUM (ADJ)* N PP            -->  DET PP NUM (ADJ)* N
DET NUM ADV (ADJ)* N           -->  DET NUM ADV (ADJ)* N
DET NUM ADV (ADJ)* N PP        -->  DET PP NUM ADV (ADJ)* N
(N)* N                         -->  (N)* N
NP S[6]                        -->  S NP
[4] This list includes only multiple-word phrases. Single-word phrases are not included.
[5] * means the element can be repeated.
[6] S stands for relative clause here.
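The rules in this list share one pattern: a trailing modifier (PP or relative clause S) is moved in front of the other modifiers, directly after the determiner when there is one. A minimal sketch of that pattern follows. The function name and the (tag, text) pair representation are hypothetical, and the handling of a relative clause inside an expanded noun phrase is an extrapolation from the PP rules rather than something the list states.

```python
from typing import List, Tuple

Constituent = Tuple[str, str]  # (category tag, English text)


def reconstruct_np(constituents: List[Constituent]) -> List[Constituent]:
    """Move trailing PP / S modifiers in front of the head noun."""
    trailing_tags = ("PP", "S")
    moved = [c for c in constituents if c[0] in trailing_tags]
    kept = [c for c in constituents if c[0] not in trailing_tags]
    if kept and kept[0][0] == "DET":
        # DET ... N PP --> DET PP ... N
        return [kept[0]] + moved + kept[1:]
    # NP S --> S NP (no leading determiner constituent)
    return moved + kept
```

For the noun phrase in (4.64), joining the texts of the result gives the word order of (4.65a).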
4.4.2.2 Other phrases
For other kinds of phrases, such as adjective phrases, adverbial phrases, and prepositional phrases, the generation rules are almost the same in English and Chinese. Thus, these phrases do not need specific processing to obtain correct Chinese word order.
The adjective phrase generation rules for English and Chinese are
English                 Chinese
Adverb Adjective   -->  Adverb Adjective
The adverbial phrase generation rules for English and Chinese are
English                 Chinese
(Adverb)*          -->  (Adverb)*
The prepositional phrase generation rules for English and Chinese are
English                 Chinese
Prep NP            -->  Prep NP
4.4.3 Multiple Adverbial Phrases Ordering
When several adverbials occur in a sentence, the order among the adverbials follows certain rules. The English order is different from the Chinese order. Thus, whenever multiple
adverbials occur, they need to be reordered according to Chinese convention.
4.4.3.1 Adverbials with different function
When several adverbial phrases with different functions
occur in a sentence, their ordering can be based on their functions. For instance, when a manner adverb, a time adverb, and a place adverb occur together in one sentence, the
usual English order is the manner adverb first, then the place adverb, and the time adverb last (Quirk, Greenbaum, Leech & Svartvik, 1979).
(4.66) She ate quietly in her room last night.
In Chinese, the corresponding order for these adverbs will be the opposite. The time adverb is first, then the place adverb. The manner adverb will be the last. Sentence (4.66) in Chinese word order will be
(4.67) a. She last night in her room quietly ate.
b. Ta zuotian wanshang zai tade fangjianli
she yesterday night in her room
jingjingde chifan.
quietly eat-meal
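The function-based ordering can be sketched as a sort on a fixed rank. The table CHINESE_FUNCTION_ORDER and the function name are hypothetical, and each adverbial is assumed to arrive already labeled with its function.

```python
from typing import List, Tuple

# Chinese order of adverbial functions: time, then place, then manner.
CHINESE_FUNCTION_ORDER = {"time": 0, "place": 1, "manner": 2}


def reorder_by_function(adverbials: List[Tuple[str, str]]) -> List[Tuple[str, str]]:
    """Sort (text, function) pairs into the Chinese order."""
    return sorted(adverbials, key=lambda a: CHINESE_FUNCTION_ORDER[a[1]])
```

For the adverbials of (4.66), the result is last night, in her room, quietly, which is the order in (4.67a).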
4.4.3.2 Adverbials with same function
When several adverbial phrases with the same function occur in one sentence, the order of these adverbials follows
certain rules. The rules differ between English and Chinese. Some believe that this fact reflects
differences in thought and cultural processes (Tou, 1988; Bi, 1989). In English, things are usually expressed from small to
large, from specific to general, from individual to group. In
Chinese, things are usually expressed in the opposite way, from large to small, from general to specific, and from group
to individual (Bi, 1989; Lu, 1990). Generally, at the phrasal level the Chinese word order follows the larger concept first principle.
Larger concept first principle:
The word or phrase which expresses a larger concept than the other words or phrases should precede the other words or phrases in a sentence.
4.4.3.2.1 Time Expression
For time expressions, the English word order is clock time, calendar time, from the smaller unit to the larger unit, such as minute, hour, day, month and year. The order for Chinese expression is the opposite, from the larger unit to
the smaller unit, i.e., from year, month, day, hour, to minute. It observes the larger concept first principle. For example, the following time expression is the correct English word order:
7:30 p.m., Wednesday, January 30, 1990.
The minute and hour unit, which is smaller than the day unit,
precedes the day unit. The year unit, which is the largest measurement in this expression, comes last.
For the same expression the Chinese word order is
1990 January 30 Wednesday 7:30 p.m.
1990 nian yiyue 30 ri xingqi san xiawu 7 shi 30 fen.
1990 year January 30 day week 3 afternoon 7 hour 30 minute
The largest unit nian 'year' is in the first position. The smallest shi 'hour' and fen 'minute' come last. It follows the larger concept first principle.
For some time phrases, mostly those containing last,
next, yesterday, or tomorrow, such as yesterday afternoon and tomorrow morning, the English order is the same as the Chinese order (zuotian xiawu, mingtian zaoshang), because in these cases the
English expression order coincides with the larger concept first principle.
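For time expressions, the larger concept first principle amounts to sorting the components by the size of their unit. The sketch below is a minimal illustration. The table TIME_UNIT_RANK, its unit names, and the function name are hypothetical, and the components are assumed to be already identified and labeled. The relative rank of the weekday and the part of day follows the Chinese example above.

```python
from typing import Dict, List

# Larger units receive smaller ranks, so they are placed first.
TIME_UNIT_RANK = {
    "year": 0, "month": 1, "day": 2, "weekday": 3,
    "period": 4, "hour": 5, "minute": 6,
}


def chinese_time_order(parts: Dict[str, str]) -> List[str]:
    """Return the component values ordered from largest to smallest unit."""
    units = sorted(parts, key=lambda unit: TIME_UNIT_RANK[unit])
    return [parts[unit] for unit in units]
```

Given the components of the English expression above, the function returns them in the order year, month, day, weekday, part of day, hour, minute.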
4.4.3.2.2 Location expression
The word order of location expressions in English is from more specific to more general position or small range to large range. In Chinese, the word order is the opposite, from more
general to more specific location and from large to small range. It follows the larger concept first principle. For example, to express an address, English will be
1426 North Main Street, Gainesville, Florida.
The smallest unit comes first. In Chinese the order will be
Florida, Gainesville, North Main Street, 1426
The largest unit comes first.
For several locative phrases in a sentence, like the following:
(4.68) Little Mary lives in a hut near a river in a
remote area.
Its Chinese word order will be
(4.69) Little Mary lives in a remote area near a river in
a hut.
The locative phrases are ordered from the largest range to the smallest range.
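Because English lists location units from the smallest to the largest, the Chinese order can be obtained by reversing the list. The sketch below illustrates this for both cases discussed above. The function names are hypothetical, and the address function assumes the simple form "number street, city, state".

```python
from typing import List


def chinese_locative_order(phrases_small_to_large: List[str]) -> List[str]:
    """Reverse locative phrases so that the largest range comes first."""
    return list(reversed(phrases_small_to_large))


def chinese_address_order(address: str) -> str:
    """Reorder an address of the assumed form 'number street, city, state'."""
    parts = [p.strip() for p in address.split(",")]
    # Separate the house number from the street name.
    number, _, street = parts[0].partition(" ")
    units = [number, street] + parts[1:]
    return ", ".join(reversed(units))
```

Applied to the three locative phrases of (4.68), the first function gives the order in (4.69).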
4.5 Lexical Level Reconstruction
At the lexical level, the reconstruction takes as its input the results of sentential and phrasal-level processing.
After the phrase reconstruction, the sentences and phrases are English words but organized in the structure of Chinese. This
is the LCF. With this LCF, what remains for the translation task is to find the correct word mapping from the source language to the target language, followed by post processing. The word mapping basically relies on the bilingual dictionary
with domain restrictions and semantic markers attached to words, and on contextual information. A word in the TL, though
corresponding to an SL word, may have some distinct properties of its own and thus require special treatment. The post processing is designed to handle these lexical level variations.
4.5.1 Noun
A noun in Chinese has a distinct feature which most English nouns do not have. That is, any Chinese noun, as long as it is preceded by a numeral or a demonstrative, must have
a measure word in between. Most nouns have their specific measure words. For example, human nouns generally can only be used with the measure word ge; a noun denoting a horse has to be used with pi; and one denoting cattle, with tiao, etc.
one person    --> yi ge ren
five persons  --> wu ge ren
that student  --> na ge xuesheng
three horses  --> san pi ma
seven cows    --> qi tiao niu
this table    --> zhe zhang zhuozi
The Chinese noun does not have inflection for number. Number is either expressed by a numeral or demonstrative or left to the interpretation of the context.
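Measure word insertion can be sketched as a dictionary lookup keyed on the noun. The table MEASURE_WORDS and the function name are hypothetical, and the entries are limited to nouns that appear in the examples of this chapter. Falling back to ge for a noun that is not listed is an assumption of this sketch.

```python
# Hypothetical noun-to-measure-word table built from the examples above.
MEASURE_WORDS = {
    "ren": "ge",        # person
    "xuesheng": "ge",   # student
    "ma": "pi",         # horse
    "niu": "tiao",      # cow
    "zhuozi": "zhang",  # table
    "shu": "ben",       # book
}


def add_measure_word(quantifier: str, noun: str, default: str = "ge") -> str:
    """Insert the measure word between a numeral/demonstrative and the noun."""
    measure = MEASURE_WORDS.get(noun, default)
    return quantifier + " " + measure + " " + noun
```

For instance, the pair san and ma yields san pi ma, and the pair na and xuesheng yields na ge xuesheng.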
4.5.2 Verb
When translating an English verb into Chinese, one thing that need not be considered is tense, as Chinese is an aspect language, not a tense language (Chao, 1968; Chu, 1983; Li & Thompson, 1981). A tense relates the time of the occurrence of an event/situation to the time at which that
event/situation is brought up in speech. In English, the past tense denotes that the time of occurrence is before the time of speech:
(4.70) I played pingpong with him.
where the suffix -ed signals that the act of playing took place before the time of speaking. Chinese has no tense
markers. The language does not use verb suffixes for the purpose of tense.
Aspect, on the other hand, refers not to the time
relation, but rather to how the situation itself is being viewed with respect to its own internal makeup (Li & Thompson, 1981).
Generally speaking, there are four aspect markers in Chinese. They are
1. the perfective aspect marker -le
2. the experiential aspect marker -guo
3. the progressive aspect marker zai
4. the durative marker -zhe
These four aspect markers with auxiliary words and
adverbial phrases sufficiently cover most of the English tense/aspect system in translation to Chinese.
4.5.2.1 The perfective aspect
The perfective aspect marker -le in general can be used to express the English past tense, though it has more
functions than just expressing tense and aspect (Chang, 1985; Chu and Chang, 1987). Sentence (4.70) in Chinese will be
(4.71) wo he ta da le pingpong qiu.
I with he play LE pingpong ball
The English perfect tense can also be expressed by this perfective aspect marker in Chinese:
(4.72) a. She has found her daughter.
b. Ta zhao dao le tade nuer.
She found reach LE her daughter