MULTI-LEVEL MACHINE TRANSLATION METHOD

　　　　　　　　 DR. SATORU IKEHARA

ABSTRACT

A Multi-Level Translation Method has been developed for machine translation. The method is based on an analysis of the speaker's recognition of both the subject and the object, and on a simultaneous analysis of syntax and meaning. It has two sub-methods: a Separate-Recombine method and a Multi-Level Transfer method. The Separate-Recombine method analyzes the subjective expressions, extracts the speaker's subjective emotions and intentions, and recombines them in the target language. The Multi-Level Transfer method transfers the remaining objective expressions into the target language in three steps, ordered by how abstract the sentence structure is. In these three steps, a specific recognition structure and an individual recognition structure are extracted and transferred in that order, and the remaining expressions are transferred by general rules.

1.　INTRODUCTION

Unlike natural science, which deals with nature, research on natural language deals with a mental product, "natural language". Many explanations of language have been proposed, differing in how they understand the mental function. For instance, Saussure's structuralism (Koerner 1973, Fages 1968, Lepschy 1970) assumed that the mental function exists in an "a priori conceptual substance", and explained that the content of the mental function is a linguistic norm. However, neither his structuralism nor ordinary grammar (Iwanami 1977) can explain homographic expressions, that is, why the same sentence structure often has different meanings. Chomsky (Apel 1971, Kazita 1976) said that a more abstract structure should be considered as the meaning. He introduced a "deep structure", assuming a thought pattern common to everyone. His idea focuses on the content of an expression. However, he separated the content from the object, so that the reflection theory was ignored and a dualism (Chomsky 1966, 1968) was asserted in which form is opposed to content.

    Chomsky (1965) first tried to explain a deep structure by syntactic structure alone in his standard theory. The major contradiction in his explanation of homographic expressions was pointed out by Katz (1970) and Postal. Then, in the revised augmented standard theory (Chomsky 1973), he changed his position and accepted that transformations of expressions change meanings. Therefore, meanings that had been explained as existing in a deep structure also came to exist in a surface structure. Thus, his idea changed from a dualism of form and content to a dualism of contents.

However, the way the object should be and the way the recognition should be are reflected in the form of an expression, so the form and the content depend on each other. Moreover, the speaker's recognition is connected to the expression. Therefore, a deep structure need not be assumed as a semantic structure separate from a surface structure. On the contrary, the relation between the object, the speaker's recognition and the expression should be considered as the meaning.

Much current research on machine translation (Uchida 1984, Muraki 1984) follows "Transformational Generative grammar", where a common structure of meaning is assumed and represented by an intermediate language independent of ordinary languages. However, if the "Process Construction of a language" (Tokieda 1941, Miura 1967), consisting of an object, a recognition and an expression, is taken into consideration, it becomes apparent that only the object is common to languages, and that the way of recognition differs from person to person and from language to language. Therefore, it is difficult to assume a deep structure as a meaning common to languages. For high-quality machine translation, the difference in recognition structure (Morita 1981, Miura 1981, Anzai 1983) between the original language and the target language should be considered.

Based on the "Process Construction Theory" of language (Tokieda 1941, Miura 1967), this paper shows the importance of two points in machine translation: the contents expressed by subjective expressions and by objective expressions, and the inseparable relation between a sentence structure and its meaning. These points are the basis of the proposed new machine translation method, MLMT (Multi-Level Machine Translation Method).

2.　PROCESS CONSTRUCTION OF A LANGUAGE

2-1　Introduction to the Process Construction Theory

Formalism and structuralism have explained a language metaphysically, as a complex body of the complete world. An expression was explained by the functions or forms of each of its parts. Chomsky's explanation satisfactorily takes notice of the content of expressions. However, it did not consider the relation between an object and a speaker's recognition. Before Chomsky, the "Process Construction Theory" proposed by Tokieda (1941) explained that a language is composed of three processes: an object, a recognition, and an expression. These processes are linked by causal relations. The way the object should be is reflected in the speaker's recognition, and the way the recognition should be is connected to an expression through the language norm. The differences between Chomsky and Tokieda are shown in Fig. 1.

2-2　Speaker's Recognition and Representation

(1)　Recognition of the Subject and the Object

The world that a speaker recognizes is composed of a subject, namely the speaker himself, and other objects, as shown in Fig. 2. A speaker recognizes both his own state and the state of other objects, and connects them to expressions. From this point of view, Tokieda explained that a Japanese sentence is composed of two kinds of expressions:

  - "Subjective Expressions" directly represent subjective emotions and intentions. Japanese adverbs and "joshi" (postpositional words that function as auxiliaries to a main word) are used for these expressions.

  - "Objective Expressions" represent conceptualized objects. Nouns, verbs and adjectives are used for these expressions. When a speaker conceptualizes the subject (the speaker himself), the subject can also be represented by these objective expressions.

The relationship between these expressions was also pointed out by the Port-Royal grammarians (Lancelot and Arnauld 1966) for Indo-European languages.

(2)　Connection between a Recognition and an Expression

The object can be subdivided into substance, attribute and relation, each of which has many structures. These structures are reflected in expressions through a speaker's recognition. Substances have hierarchical relations, such as that between a part and the whole. An attribute is related to a substance. Relations are mainly of three kinds: relations between substances, relations between attributes, and relations between relations.

These components and partial structures are combined in many ways in a speaker's mind to construct a total structure. The way of recognition depends on the viewpoint of the speaker. Every language has its own framework for representing such recognition.

Thus, the original structure of the object is not reflected in the expression directly, but through a speaker's recognition. The difference between languages is the difference between their frameworks for a speaker's recognition. Accordingly, the relationship between the subjective expressions and the objective expressions in the original language should be reconstructed and expressed in the target language.

2-3　Definition of Meaning

(1)　Ordinary Explanations and the New Definition

Many explanations of the meaning of an expression have been proposed in ordinary linguistics. Katz (1970) explained that meaning exists in a dictionary. Grice (Miura 1981) said it is the listener's behavior. Searle (Miyashita 1981) defined it as the speaker's intention. Saussure (Koerner 1973) explained it by an "a priori conceptual substance". Moreover, Wiseman (1965) said that we should not think about meaning.

Saussure separated "langue", which is social, from "parole", which is individual. Chomsky, in contrast, separated content (a "deep structure") from syntactic structure. In this respect their ideas differ. However, they agree that contents are independent of the object, and that "langue" and "deep structure" are common to humans or human groups. Much current research on machine translation is based on their ideas and assumes a deep structure to be the meaning. There are also cases where a deep structure is explained as the way an object should be, as distinct from a speaker's recognition.

In opposition to these ideas, we define the meaning of an expression as the relations between the object, the recognition and the expression. This definition is based on the fact that the way a subject and an object should be is connected to an expression through a speaker's recognition. Accordingly, meaning (i.e. this relation) does not exist without an expression. The ordinary meaning of a word defined in a dictionary is not exactly a meaning, but a language norm. Only when a word is used in an expression does its relation to a recognition and an object arise. Only then does the word acquire a meaning.

(2)　Meaning of a Syntactic Structure

A speaker consolidates his recognition and represents it by an expression, using rules for words, phrases and clauses. This consolidation is based on language norms supported by meaning. That is, the way the object should be is reflected in the speaker's recognition, and the speaker's recognition is reflected in the expression. This means that the syntactic structure is bound to the object and the recognition, and hence that the syntactic structure is part of the meaning. Unlike "Transformational Generative grammar", where a structure (a surface structure) is opposed to a meaning (a deep structure), a surface structure can be regarded as part of the meaning. Therefore, strictly speaking, transforming an expression changes its meaning; a transformation cannot leave the meaning unchanged. Transformations in ordinary language processing are transformations to other, approximate expressions. Accordingly, an Element Composition method, which tries to compose the meaning of the whole from the meanings of its parts, neglects the meaning of the syntactic structure and cannot prevent that meaning from being lost. The meaning of a part of an expression can be interpreted only in the context of the whole sentence. Accordingly, the rules for the meaning of a word used in an expression can be determined only in context, as shown in Fig. 3.

3.　PROPOSAL OF MULTI-LEVEL MACHINE TRANSLATION METHOD

A "competent sentence" can be defined as a sentence that can be translated in isolation with only knowledge of the language, namely a grammar and a dictionary. The Multi-Level Machine Translation Method (MLMT method) is proposed for competent sentences, as shown in Fig. 4. The MLMT method has two sub-methods, described in this section.

    General knowledge of the world and context analysis across several sentences have often been pointed out as necessary in machine translation. However, this knowledge and analysis are not always necessary. About 90% of Japanese written sentences in practical use are "competent" sentences. Therefore, we first focus on the translation of these sentences.

3-1　Separate and Recombine Method for the Subjective Expressions

Japanese is classified as an agglutinative language, in which "joshi" and adverbs are used for subjective expressions. English, in contrast, is an inflectional language, in which subjective expressions are usually represented by inflections. Thus, Japanese subjective expressions do not correspond directly to English ones, and it is difficult to translate them word for word. First, the speaker's emotions and intentions are classified into categories. An analysis determines which categories the subjective part of a given Japanese sentence represents. The original Japanese sentences are thereby transformed into basic Japanese sentences. The Japanese sentences that remain after the subjective expressions are extracted are objective expressions. These expressions are translated into basic English sentences by the Multi-Level Transfer method described in the next section. Finally, the speaker's previously extracted emotions and intentions are recombined with the basic English sentences: adverbs and prepositions are added, and nouns and verbs are inflected. Thus, the information about the subjective expressions separated from the original Japanese sentences is recombined in creating the English sentences.
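
The separation and recombination steps described above can be sketched roughly as follows. This is only an illustrative sketch, not the ALT-J/E implementation: the marker table, the category names, the romanized and lemmatized morphemes, and the function names `separate` and `recombine` are all assumptions made for this example. The recombination handles only regular English verbs with a third-person singular subject, and the case particles are deliberately left in the objective core because they are needed to relate the sentence elements.

```python
# Hypothetical table of subjective markers (romanized, lemmatized morphemes).
# The categories and entries are illustrative; they are not taken from the paper.
SUBJECTIVE_MARKERS = {
    "ta": ("tense", "past"),
    "nai": ("polarity", "negative"),
    "darou": ("modality", "conjecture"),
}


def separate(morphemes):
    """Split morphemes into an objective core and a table of subjective categories."""
    core, table = [], {}
    for m in morphemes:
        if m in SUBJECTIVE_MARKERS:
            category, value = SUBJECTIVE_MARKERS[m]
            table[category] = value
        else:
            core.append(m)
    return core, table


def recombine(subject, verb, obj, table):
    """Inflect a basic English sentence according to the subjective table.

    Assumes a regular verb and a third-person singular subject.
    """
    past = table.get("tense") == "past"
    if table.get("polarity") == "negative":
        predicate = ("did not " if past else "does not ") + verb
    else:
        predicate = verb + ("ed" if past else "s")
    words = [subject, predicate, obj]
    if table.get("modality") == "conjecture":
        words.insert(1, "probably")
    sentence = " ".join(words)
    return sentence[0].upper() + sentence[1:] + "."


# "kare ga mado o akeru" + negation + past
core, table = separate(["kare", "ga", "mado", "o", "akeru", "nai", "ta"])
print(core)   # objective core that would be handed to the transfer step
print(table)  # subjective information kept for recombination
# The basic English sentence is supplied by hand here; in the method it
# would come from the Multi-Level Transfer step.
print(recombine("he", "open", "the window", table))
```

Keeping the subjective information in a separate table means the transfer step only ever sees the basic sentence, which is what allows one pattern to cover many surface variations of tense, polarity and modality.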

    It is sufficient to classify the information in subjective expressions only to the granularity required for translation from Japanese to English. Therefore, strictly speaking, the objective expressions remaining after this subjective information is extracted still contain a few subjective expressions. These subjective expressions are used to understand the relations between sentence elements in the analysis process of subjective expressions.

3-2　Abstraction of Structures of Objective Expressions and the Multi-Level Transfer Method

The way the object should be is represented in a basic Japanese sentence (an objective expression) through the speaker's viewpoint. A speaker's recognition of an object has several structures, and these structures are reflected in the structure of the objective expressions. If we hold strictly that changing an expression changes its meaning, and that syntactic structure and meaning must be processed as a unit for accurate translation, then a matching English expression is needed for every Japanese expression. Clearly, this is not practical, because the number of expressions is infinite. Thus, sentence structures are classified into three levels according to the strength of the link between the sentence structure and the meaning. Each sentence structure is transferred by the method suited to its level.

(1)　Specific Recognition Structures (An Idiomatic Expression Transfer Method)

Expressions such as idioms have meanings that cannot be determined from their individual words. Moreover, there are many Japanese phrases that correspond to single English words. These expressions are not idiomatic in Japanese but are idiomatic in English.

These kinds of expressions are especially difficult for an Element Composition method to translate. Therefore, they are transferred as a whole by Idiomatic Expression Transfer rules, each prepared as a matched pair of Japanese and English expressions in a pair pattern dictionary. An idiomatic expression is made up of several words. If such a word combination appears, the Idiomatic Expression Transfer method is applied preferentially.
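
As a rough illustration of how such a pair pattern dictionary might be consulted, the following sketch scans a sentence and replaces any registered word combination with its paired English expression, trying the longest combination first. The dictionary entries, the romanized tokens and the function name `transfer_idioms` are assumptions made for this example; the paper does not specify the matching procedure.

```python
# Hypothetical pair pattern dictionary: fixed Japanese word sequences
# (romanized) paired with whole English expressions. Illustrative entries only.
IDIOM_PATTERNS = {
    ("hara", "ga", "tatsu"): "get angry",
    ("abura", "o", "uru"): "loaf around",
}


def transfer_idioms(tokens):
    """Replace registered word combinations with their English pair.

    Returns a list of (language, text) items; words that are not part of
    any idiom are passed through for the later, more general levels.
    """
    max_len = max(len(key) for key in IDIOM_PATTERNS)
    out, i = [], 0
    while i < len(tokens):
        # Try the longest possible combination first.
        for n in range(min(max_len, len(tokens) - i), 0, -1):
            key = tuple(tokens[i:i + n])
            if key in IDIOM_PATTERNS:
                out.append(("EN", IDIOM_PATTERNS[key]))
                i += n
                break
        else:
            out.append(("JA", tokens[i]))
            i += 1
    return out


print(transfer_idioms(["kare", "ga", "abura", "o", "uru"]))
```

Because the whole combination is looked up as one key, the meaning of the idiom is never assembled from its individual words, which is the point of transferring these expressions as a unit.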

(2)　Individual Recognition Structure (A Semantic Valentz Pattern Transfer Method)

More general structures are classified into this category. In a specific recognition structure, the surface forms of all the words are completely fixed. In contrast, an individual recognition structure is a structure in which the surface form of a single word is fixed, while the other words are not fixed but are restricted by their semantic attributes. When the surface form of a declinable word is fixed, the contents of the other "bunsetsu" (Japanese clauses) connected to the declinable word are restricted in their use of "joshi" and in the semantic attributes of their nouns. A case grammar (Fillmore 1975) can be applied to these individual recognition structures. However, a Valentz Pattern (Ishiwata 1983) Transfer method is more suitable because it does not need a deep structure, which is difficult to define exactly. In a case grammar, some of the meaning of a syntactic structure is lost in the process of deciding deep cases, whereas the meaning of a syntactic structure can be transferred in a Valentz Pattern Transfer method.

This paper uses a Valentz Pattern Transfer method augmented by restrictions on the attributes of words. The method is supported by a system of precise and mutually exclusive semantic attributes of words. It can transfer meanings that cannot be categorized by case grammar.

Individual recognition structures are registered in a pattern dictionary as pairs of Japanese and English expressions. An English recognition structure also has a related key word used in the translation, together with other restrictions such as prepositions and semantic attributes of nouns, so the ambiguity in selecting a translation word decreases. This method requires many patterns, up to about ten thousand, and a precise system of semantic attributes. However, the pattern groups, each consisting of patterns that share a key word, are independent of each other. Therefore, in principle, the consistency of each rule does not need to be checked against the rules of other groups, and the system of transfer rules can be expanded easily.
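
A minimal sketch of such a pattern dictionary is given below, assuming a very small lexicon. The verb "nomu" is the fixed key word, and the nouns in the connected bunsetsu are restricted by joshi and by semantic attribute, so the same verb is paired with "drink" or "take" depending on the object. The attribute names, the lexicon, the pattern format and the function name `transfer` are all hypothetical; the output is an uninflected basic English expression.

```python
from typing import NamedTuple, Optional


class Slot(NamedTuple):
    particle: str    # joshi marking the bunsetsu
    attribute: str   # required semantic attribute of the noun
    variable: str    # placeholder name in the English pattern


class Pattern(NamedTuple):
    slots: tuple     # restrictions on the bunsetsu connected to the verb
    english: str     # paired basic English expression


# Hypothetical noun lexicon: romanized noun -> (English word, semantic attribute).
NOUNS = {
    "kare": ("he", "human"),
    "mizu": ("water", "liquid"),
    "kusuri": ("medicine", "medicine"),
}

# Patterns are grouped by key word, so each group can be extended independently.
PATTERNS = {
    "nomu": [
        Pattern((Slot("ga", "human", "X"), Slot("o", "liquid", "Y")), "{X} drink {Y}"),
        Pattern((Slot("ga", "human", "X"), Slot("o", "medicine", "Y")), "{X} take {Y}"),
    ],
}


def transfer(verb: str, bunsetsu: dict) -> Optional[str]:
    """Return the English pattern for the first matching entry, or None.

    `bunsetsu` maps each joshi to the noun it marks, e.g. {"ga": "kare"}.
    """
    for pattern in PATTERNS.get(verb, []):
        bindings = {}
        for slot in pattern.slots:
            entry = NOUNS.get(bunsetsu.get(slot.particle))
            if entry is None or entry[1] != slot.attribute:
                break  # this pattern's restriction is not satisfied
            bindings[slot.variable] = entry[0]
        else:
            return pattern.english.format(**bindings)
    return None


print(transfer("nomu", {"ga": "kare", "o": "mizu"}))
print(transfer("nomu", {"ga": "kare", "o": "kusuri"}))
```

Adding a pattern for another verb only touches that verb's own list, which reflects the independence between pattern groups noted above.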

(3)　General Recognition Structure (A General Pattern Transfer Method)

In the previous two structures, a specific word and its combinations were considered as a pattern. Here, more general patterns are considered. The surface form of a word is not fixed. For instance, patterns are classified by verb type, such as instantaneous verbs, verbs of state and so on. General patterns corresponding to groups of verbs are prepared. Because of this generality, rough translation cannot be avoided with this method.

For these three methods, the more specific the structure of a pattern, the higher the translation quality that can be expected. The methods are applied to basic Japanese sentences (objective expressions) in the order described above. If no pattern relevant to a given Japanese sentence can be found in the Idiomatic Expression Pattern dictionary or the Semantic Valentz Pattern dictionary, a General Pattern is used and the quality of translation decreases. However, as the pair pattern dictionary grows, the translation quality should improve.
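
The order of application can be pictured as a simple fallback chain. The sketch below is illustrative only: the three rule functions are stand-ins with one hypothetical entry each, and the level names and the function `multi_level_transfer` are assumptions for this example rather than parts of the actual system.

```python
def idiomatic(sentence):
    # Level 1: all words fixed (hypothetical single-entry dictionary).
    return {"hara ga tatsu": "get angry"}.get(sentence)


def valency(sentence):
    # Level 2: key word fixed, other words restricted (stand-in lookup).
    return {"kusuri o nomu": "take medicine"}.get(sentence)


def general(sentence):
    # Level 3: most general patterns; always answers, but only roughly.
    return "<rough translation of: " + sentence + ">"


# Levels are listed from the most specific to the most general.
LEVELS = [("idiomatic", idiomatic), ("valency", valency), ("general", general)]


def multi_level_transfer(sentence):
    """Apply the levels in order and report which level produced the result."""
    for name, rule in LEVELS:
        result = rule(sentence)
        if result is not None:
            return name, result
    # Not reached while the general level always returns a result.
    raise LookupError(sentence)


print(multi_level_transfer("hara ga tatsu"))
print(multi_level_transfer("mado o akeru"))
```

Reporting the level alongside the result gives a crude indication of expected quality, since a result from the general level is the least reliable.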

3-3　Construction of the MLMT (Multi-Level Machine Translation) Method

The MLMT method consists of two sub-methods, a Separate and Recombine method for subjective expressions and a Multi-Level Transfer method for objective expressions, as shown in Fig. 5.

This translation process is similar to translation by a human, shown in Fig. 6. In human translation, a translator first relives for himself the speaker's experience described by a given sentence. This process is supported by the Japanese norm that connects a speaker's recognition to Japanese expressions. Thus, the translator understands the way the objects should be and the speaker's emotions and intentions towards them. In the MLMT method, an original Japanese sentence is separated into a description of the way the objects should be and a description of the speaker's emotions. The way the objects should be is represented by a basic Japanese sentence (an objective expression), and the speaker's emotions are arranged in a reference table. In human translation, the way the objects should be is next reorganized in the framework of English, and the speaker's emotions are recombined with it. Similarly, in the MLMT method, the meanings of the objects are transferred into English by the three levels of transfer method. The speaker's emotions, arranged in the reference table, are then recombined to give the final English expressions.

This method has the following characteristics. Idiomatic patterns and Valentz patterns, which represent syntactic structure and meaning together, can be used not only for Japanese to English transfer but also for Japanese sentence analysis. Therefore, there should be fewer ambiguities in the analysis than with ordinary methods. Moreover, transfer rules are highly independent of each other, so consistency checks are limited to a small range, and the translation system should be easy to expand.

4.　Implementation of a Japanese to English Translation System

The MLMT method has been implemented in the "Automatic Language Translation System for Japanese to English" (ALT-J/E). This system first analyzes the morphemes of a given sentence. A morpheme is a meaningful linguistic unit that does not contain any smaller meaningful units. In this analysis, the boundaries of words and the syntactic features of every word are determined. Dependencies between components of the sentence are subsequently determined. A "unit sentence" is extracted as a declinable word and its related parts. A unit sentence has only one declinable word. A "simple sentence" has a single declinable word at the top of the tree structure of the sentence. It sometimes has several declinable words at lower nodes of the tree structure.

When a simple sentence is separated into several unit sentences, the relations between the declinable words of the simple sentence are preserved. After a unit sentence is extracted, it is treated as the unit of analysis. A unit sentence is transformed into a basic Japanese sentence by extracting subjective information such as aspect, mode and tense represented by the predicate. Patterns are used to analyze a Japanese sentence. When a simple sentence has several candidate unit sentences, every candidate is analyzed with the patterns. Only the unit sentences that fit some pattern are used. This process decreases the ambiguity of the analysis.
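
The filtering of candidate unit sentences might look roughly like the following sketch. It assumes, purely for illustration, that a candidate is reduced to its declinable word and the joshi of its connected bunsetsu, and that a pattern accepts an exact set of joshi; the index contents and the function names are hypothetical and much simpler than a real pattern dictionary with semantic attributes.

```python
# Hypothetical pattern index: key verb -> sets of joshi accepted by its patterns.
PATTERN_INDEX = {
    "nomu": [{"ga", "o"}],
    "iku": [{"ga", "ni"}, {"ga", "e"}],
}


def fits_some_pattern(candidate):
    """Check whether a candidate (verb, joshi tuple) fits any registered pattern."""
    verb, particles = candidate
    return any(set(particles) == accepted for accepted in PATTERN_INDEX.get(verb, []))


def select_unit_sentences(candidates):
    """Keep only the candidate unit sentences that fit some pattern."""
    return [c for c in candidates if fits_some_pattern(c)]


candidates = [
    ("nomu", ("ga", "o")),   # fits a "nomu" pattern
    ("nomu", ("ga", "ni")),  # no "nomu" pattern accepts this combination
    ("iku", ("ga", "e")),    # fits an "iku" pattern
]
print(select_unit_sentences(candidates))
```

Candidates rejected here never reach the transfer step, so the same patterns serve both to disambiguate the analysis and to select the English side.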

After this analysis with patterns, the pattern to be applied to every unit sentence has been determined, so the corresponding English pattern has been determined at the same time. Basic English expressions can then be obtained easily. The final English sentence is generated by adding the subjective information kept in the related table.

5.　Conclusion

A machine translation method called the MLMT (Multi-Level Machine Translation) method was developed based on the Process Construction Theory of natural language.

Problems of ordinary methods based on transformational generative grammar were discussed. It was shown that, for machine translation, it is important that recognitions of the subject and of the object be separated, and that the meanings connected to a syntactic structure not be lost. The MLMT method consists of two sub-methods that correspond to these two ideas: a Separate and Recombine method for subjective expressions, and a Multi-Level Transfer method for objective expressions.

Ideally, to handle a syntactic structure and its meaning as one unit and so produce high-quality translation, all expressions should be registered. The characteristics of natural language make this technically impractical. The technical compromise can be summarized as follows: ① sentence structures are classified as patterns corresponding to abstraction levels of a speaker's recognition; ② subjective expressions are separated from the original sentences to improve the ratio of sentences that fit patterns.

The MLMT method was proposed for translating "competent Japanese sentences" into English. However, this method can also be applied to other translations, such as English to Japanese or Japanese to Chinese.

Acknowledgement

The author wishes to thank Dr. Masahiro Miyazaki and Mr. Satoshi Shirai for their valuable discussions. He also wishes to thank the members of their group for implementing this method in the ALT-J/E system.

References:

Anzai T. (1983). 'Conception in English', Kodan-sha (in Japanese)

Apel K.O. (1971). 'Noam Chomskys Sprachtheorie und die Philosophie der Gegenwart' (Japanese edition by S. Iguch, Taishukan, 1976)

Chomsky N. (1965). 'Aspects of the Theory of Syntax', MIT Press, Cambridge, Mass.

Chomsky N. (1966). 'Cartesian Linguistics' (Japanese edition by Kawamoto, Misuzu)

Chomsky N. (1968). 'Language and Mind', New York

Chomsky N. (1973). 'Conditions on Transformations', Anderson and Kiparsky, pp.232-236

Fages J.B. (1968). 'Comprendre le structuralisme', Collection <<Regard>>, Privat (Japanese edition by H. Kato, Taishukan, 1972)

Fillmore C.J. (1975). 'Toward a Modern Theory of Case and Other Articles', Holt, Rinehart & Winston Inc., New York (Japanese edition by H. Tanaka and M. Funakoshi, Sanseido, 1975)

Ishiwata T. (1983). 'Grammar and Meaning Ⅰ', Asakura-syoten (in Japanese)

Iwanami (1977). 'Japanese 6 (Grammar Ⅰ), 7 (Grammar Ⅱ)' (in Japanese)

Katz J.J. (1970). 'The Philosophy of Language' (Japanese edition by U. Nishiyama, Taishukan)

Kazita M. (1976). 'The Trace of Transformational Theory', Taishukan (in Japanese)

Koerner E.F.K. (1973). 'Ferdinand de Saussure', Braunschweig: Friedr. Vieweg+Sohn GmbH (Japanese edition by K. Yamanaka, Taishukan, 1982)

Lancelot C. and Arnauld A. (1966). 'Grammaire générale et raisonnée, les fondements de l'art de parler' (Japanese edition by H. Minamikata, Taishukan, 1972)

Lepschy G.C. (1970). 'A Survey of Structural Linguistics', Faber & Faber (Japanese edition by S. Sugata, Taishukan, 1975)

Miura T. (1967). 'The Theory of Noesis and Linguistics Vol.1-3', Keiso-shobo (in Japanese)

Miura T. (ed.) (1981). 'Critique of Modern Linguistics', Keiso-shobo (in Japanese)

Miyashita S. (1981). 'Searle's linguistics' (Critique of Modern Linguistics, pp.121-135, ed. T. Miura, Keiso-shobo, in Japanese)

Morita Y. (1981). 'Conception by Japanese', Koki-sha (in Japanese)

Muraki K. (1984). 'Japanese to English Translation System PIVOT', Nikkei Electronics, 7-Dec., pp.195-220 (in Japanese)

Nagao M. (1983). 'Language Engineering', Shokodo (in Japanese)

Schank R.C. (1975). 'Conceptual Information Processing', North-Holland

Tokieda M. (1941). 'Kokugogaku Genron (Principles of Linguistics)', Iwanami (in Japanese)

Uchida H. (1984). 'Japanese to English Translation System ATLAS Ⅱ', Nikkei Electronics, 17-Dec.

Wiseman F. (1965). 'The Principles of Linguistic Philosophy', Macmillan and Co., Ltd. (Japanese edition by J. Kusunose, Taishukan)