MULTI-LEVEL MACHINE TRANSLATION METHOD
 
 
 
 
 
 
         DR. SATORU IKEHARA
ABSTRACT
 
  A Multi-Level Translation Method is developed for machine
translation. This method is based on an analysis of the speaker's
recognition of both the subject and the object, and a simultaneous
analysis of the syntax and meaning. There are two sub-methods: a
Separate-Recombine method and a Multi-Level Transfer method. The
Separate-Recombine method analyzes the subjective expression,
extracts subjective emotions and intentions, and recombines them
into the target language. The Multi-Level Transfer method transfers
the remaining objective expressions into the target language in
three steps based on how abstract the sentence structure is. In
these three steps, a specific recognition structure and an
individual recognition structure are extracted and transferred in
that order, and the remaining expressions are transferred by
general rules.
1. INTRODUCTION
   Unlike natural science, which deals with nature, research on
natural language deals with a mental product, "natural language".
Many explanations of language have been proposed, differing in how
they understand the mental function. For instance, Saussure's
structuralism (Koerner 1973, Fages 1968, Lepschy 1970) assumed that
the mental function exists in an "a priori conceptual substance",
and explained that the content of the mental function is a
linguistic norm. However, neither his structuralism nor ordinary
grammar (Iwanami 1977) can explain homographic expressions (why the
same sentence structure often has different meanings). Chomsky
(Apel 1971, Kazita 1976) said that a more abstract structure should
be considered as the meaning. He introduced a "deep structure",
assuming a thought pattern common to everyone. His idea focuses on
the content of an expression. But he separated the content from
the object, so that reflection theory was ignored and a dualism
(Chomsky 1966, 1968) was asserted, in which form was set against
content.
    Chomsky (1965) first tried to explain a deep structure only by
  a syntactic structure in his standard theory. The major
  contradiction in his explanation of homographic expressions was
  pointed out by Katz (1970) and Postal. Then, in the revised
  augmented standard theory (Chomsky 1973), he changed his idea to
  accept that transformations of expressions change meanings.
  Therefore, meanings that had been explained as existing in a deep
  structure also came to exist at a surface structure. Thus, his
  idea changed from the dualism of a form and a content to the
  dualism of contents.
   However, the way that the object should be, and the way that the
recognition should be, are reflected in the form of an expression,
so that the form and the content depend on each other. The
speaker's recognition is also connected to an expression.
Therefore, a deep structure need not be assumed as a semantic
structure separated from a surface structure. On the contrary, the
relation between the object, the speaker's recognition and an
expression should be considered as the meaning.
   A lot of current research (Uchida 1984, Muraki 1984) on machine
translation follows "Transformational Generative grammar", where a
common structure of meaning is assumed and represented by an
intermediate language independent of ordinary languages. However,
if the "Process Construction of a language" (Tokieda 1941, Miura
1967), that is, an object, a recognition, and an expression, is
taken into consideration, it becomes apparent that only the object
is common to languages, and that the way of recognition differs
from person to person and from language to language. Therefore, it
is difficult to assume a deep structure as a meaning common to
languages. The difference in recognition structure (Morita 1981,
Miura 1981, Anzai 1983) between the original language and the
target language should be considered for high-quality machine
translation.
   Based on the "Process Construction Theory" of language (Tokieda
1941, Miura 1967), this paper shows the importance of two points in
machine translation: the distinction between the contents expressed
by a subjective expression and by an objective expression, and the
indivisible relation between a sentence structure and a meaning.
These points are the basis of the proposed new machine translation
method, the MLMT (Multi-Level Machine Translation) method.
2. PROCESS CONSTRUCTION OF A LANGUAGE
 
2-1 Introduction to the Process Construction Theory
 
   Formalism and structuralism have explained a language
metaphysically, as a complex body of the complete world. An
expression was explained by the functions or forms of each part of
the expression. Chomsky's explanation satisfactorily takes note of
the content of expressions. However, the relation between an
object and a speaker's recognition was not considered. Before
this idea, the "Process Construction Theory" proposed by Tokieda
(1941) explained that a language is composed of three processes: an
object, a recognition, and an expression. These processes are
linked by a relation of causality. The way that the object should
be is reflected in a speaker's recognition, and the way that a
recognition should be is connected to an expression through the
language norm. The differences between Chomsky and Tokieda are
shown in Fig. 1.
2-2 Speaker's Recognition and Representation
(1) Recognition of the Subject and the Object
   The world that a speaker recognizes is composed of a subject,
namely the speaker himself, and other objects, as shown in Fig. 2.
A speaker recognizes both his own state and the state of other
objects, and connects them to expressions. From this point of
view, Tokieda explained that a Japanese sentence is composed of two
kinds of expressions:
  (a) "Subjective Expressions" directly represent subjective
      emotions and intentions. Japanese adverbs and "joshi"
      (postpositional words functioning as auxiliaries to a main
      word) are used for these expressions.
  (b) "Objective Expressions" represent conceptualized objects.
      Nouns, verbs and adjectives are used for these expressions.
      When a speaker conceptualizes a subject (the speaker
      himself), the subject can be represented by these objective
      expressions.
   The relationship between these expressions was pointed out for
Indo-European languages by the Port-Royal grammar (Lancelot and
Arnauld 1966).
(2) Connection between a Recognition and an Expression
   The object can be subdivided into the substance, the attribute
and the relation, all of which have many structures. These
structures are reflected in expressions through a speaker's
recognition. The substance has hierarchical relations, such as
that between a part and the whole. The attribute is related to the
substance. The relation is mainly of three kinds: relations
between substances, between attributes, and between relations.
   These components and partial structures are combined in many
ways in a speaker's mind to construct a total structure. The way
of recognition depends on the viewpoint of the speaker. Every
language has its own framework that represents such a recognition.
   Thus, the original structure of the object is not reflected
directly in the expression, but through a speaker's recognition.
The difference between languages is the difference between the
frameworks of a speaker's recognition. Therefore, the relationship
between a subjective expression and an objective expression in the
original language should be reconstructed and expressed in the
target language.
2-3 Definition of Meaning
(1) Ordinary Explanations and the New Definition
   Many explanations of the meaning of an expression have been
proposed in ordinary linguistics. Katz (1970) explained that a
meaning exists in a dictionary. Grice (Miura 1981) said it was a
listener's behavior. Searle (Miyashita 1981) defined it as a
speaker's intention. Saussure (Koerner 1973) explained it by an "a
priori conceptual substance". Moreover, Wiseman (1965) said that
we should not think about meaning.
   Saussure separated a "langue", which is social, from a "parole",
which is individual. Chomsky, by contrast, separated a content (a
"deep structure") from a syntactic structure. In this respect,
their ideas are different. However, they agree that contents are
independent of the object, and that a "langue" and a "deep
structure" are common to humans or human groups. A lot of current
research on machine translation is based on their ideas and assumes
a deep structure to be the meaning. There are also cases where a
deep structure is explained as the way that an object should be,
as distinct from a speaker's recognition.
   In opposition to these ideas, we define the meaning of an
expression to be the relations between the object, the recognition
and the expression. This definition is based on the fact that the
way that a subject and an object should be is connected to an
expression through a speaker's recognition. Accordingly, the
meaning (i.e. the relation) does not exist without an expression.
The ordinary meaning of a word defined in a dictionary is not
exactly a meaning, but a language norm. Only when a word is used
in an expression does the relation to a recognition and an object
arise. Then, the word acquires a meaning.
(2) Meaning of a Syntactic Structure
   A speaker consolidates his recognition and represents it by an
expression, using rules for words, phrases and clauses. This
consolidation is based upon language norms supported by a meaning.
That is, the way that the object should be is reflected in a
speaker's recognition, and a speaker's recognition is reflected in
an expression. This means that the syntactic structure is combined
with the object and the recognition, and that the syntactic
structure is part of the meaning. Unlike in "Transformational
Generative grammar", where a structure (a surface structure) is
opposed to a meaning (a deep structure), a surface structure can be
thought of as part of a meaning. Therefore, transforming an
expression, strictly speaking, changes the meaning. A
transformation cannot leave the meaning unchanged.
Transformations in ordinary language processing are
transformations to other, approximate expressions. Accordingly,
an Element Composition method, which tries to compose the whole
meaning from the meanings of the parts, neglects the meaning of a
syntactic structure and cannot prevent that meaning from being
lost. The meaning of a part of an expression can be interpreted
only in the context of the whole sentence. Accordingly, the rules
for the meaning of a word used in an expression can be determined
only in context, as shown in Fig. 3.
3. PROPOSAL OF MULTI-LEVEL MACHINE TRANSLATION METHOD
   A "competent sentence" can be defined as a sentence that can be
translated in isolation with only a knowledge of the language,
namely a grammar and a dictionary. The Multi-Level Machine
Translation Method (MLMT method) is proposed for competent
sentences, as shown in Fig. 4. The MLMT method has two
sub-methods, described in this section.
    General knowledge of the world and context analysis across
  several sentences have often been pointed out to be necessary in
  machine translation. However, this knowledge and analysis are
  not always necessary. About 90% of Japanese written sentences in
  practical use are "competent" sentences. Therefore, we first
  focus on the translation of these sentences.
3-1 Separate and Recombine Method for the Subjective
   Expressions
   Japanese is classified as an agglutinative language, in which
"joshi" and adverbs are used for subjective expressions.
Conversely, English is an inflectional language, in which
subjective expressions are usually represented by inflections.
Thus, Japanese subjective expressions do not directly correspond to
English ones, and it is difficult to translate them word for word.
First, the speaker's emotions and intentions are classified into
categories. An analysis determines which categories the subjective
part of a given Japanese sentence represents. The original
Japanese sentences are thus transformed into basic Japanese
sentences. The parts of the Japanese sentences remaining after
the subjective expressions are extracted are objective
expressions. These expressions are translated into basic English
sentences by the Multi-Level Transfer method described in the next
section. Finally, the speaker's previously extracted emotions and
intentions are recombined with the basic English sentences:
adverbs and prepositions are added, and nouns and verbs are
inflected. Thus, the information about the subjective expressions
separated from the original Japanese sentences is recombined in
creating the English sentences.
    It is sufficient for the information of subjective expressions
  to be classified at the granularity required for a translation
  from Japanese to English. Therefore, strictly speaking, the
  objective expressions remaining after this subjective
  information is extracted also contain a few subjective
  expressions. These subjective expressions are used to understand
  the relations between sentence elements in the analysis process
  of subjective expressions.
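
   The separate-and-recombine flow described in this section can be
sketched in Python as follows. This is a minimal illustration under
stated assumptions, not the ALT-J/E implementation: the romanized
tokens, the marker table SUBJECTIVE_MARKERS, the category names, and
the functions separate and recombine are all hypothetical, and the
toy token list stands in for real morphological analysis.

```python
# Hypothetical table: surface marker -> (category, value).
SUBJECTIVE_MARKERS = {
    "nai": ("polarity", "negative"),
    "ta": ("tense", "past"),
    "darou": ("modality", "conjecture"),
}


def separate(tokens):
    """Split tokens into objective tokens and a table of subjective features."""
    objective, features = [], {}
    for token in tokens:
        if token in SUBJECTIVE_MARKERS:
            category, value = SUBJECTIVE_MARKERS[token]
            features[category] = value
        else:
            objective.append(token)
    return objective, features


def recombine(subject, verb_forms, obj, features):
    """Inflect a basic English clause using the stored subjective features.

    verb_forms is a (base, past) pair; the subject is assumed to be
    third person singular to keep the toy inflection rules short.
    """
    base, past = verb_forms
    is_past = features.get("tense") == "past"
    if features.get("polarity") == "negative":
        predicate = ("did not " if is_past else "does not ") + base
    else:
        predicate = past if is_past else base + "s"
    if features.get("modality") == "conjecture":
        predicate = "probably " + predicate
    return f"{subject} {predicate} {obj}."


# Toy segmentation of a sentence meaning "He did not read the book".
tokens = ["kare", "ga", "hon", "o", "yomu", "nai", "ta"]
objective, features = separate(tokens)
english = recombine("He", ("read", "read"), "the book", features)
```

   In this sketch the feature table plays the role of the reference
table of subjective information: it is filled during separation and
consulted only when the basic English clause is inflected.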
3-2 Abstraction of Structures of Objective Expressions and the
   Multi-Level Transfer Method
   The way that the object should be is represented in a basic
Japanese sentence (an objective expression) through a speaker's
viewpoint. A speaker's recognition of an object has several
structures, and these structures are reflected in the structure of
the objective expressions. If we strictly hold that changing an
expression changes the meaning, and that syntactic structure and
meaning must be processed together for accurate translation, then
a matching English expression is needed for every Japanese
expression. Clearly, this is not practical because the number of
expressions is infinite. Thus, sentence structures are classified
into three levels according to the strength of the link between the
sentence structure and the meaning. Each sentence structure is
transferred by the method suited to its level.
(1) Specific Recognition Structures (An Idiomatic
   Expression Transfer Method)
   Expressions such as idioms have meanings which cannot be
determined from their individual words. Moreover, there are many
Japanese phrases that correspond to single English words. These
expressions are not idiomatic in Japanese but are idiomatic in
English.
   These kinds of expressions are especially difficult for an
Element Composition method to translate. Therefore, these
expressions are transferred as a whole by Idiomatic Expression
Transfer rules, prepared as matched pairs of Japanese and English
expressions in a pair pattern dictionary. An idiomatic expression
is made up of several words. If such a word combination appears,
the Idiomatic Expression Transfer method is applied
preferentially.
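
   A pair pattern lookup of this kind might look like the following
Python sketch. It assumes, for illustration only, that idioms are
stored as fixed token sequences and matched longest first; the
dictionary IDIOM_PAIRS, its two romanized entries, and the function
transfer_idioms are hypothetical and do not reproduce the actual
rule format.

```python
# Hypothetical pair pattern dictionary: fixed Japanese token sequence
# -> English expression transferred as a whole.
IDIOM_PAIRS = {
    ("hara", "ga", "tatsu"): "get angry",
    ("ki", "o", "tsukeru"): "take care",
}


def transfer_idioms(tokens):
    """Replace registered word combinations, trying the longest match first.

    Returns a list of (language, text) pairs: "EN" for a transferred
    idiom, "JA" for a token left for the later transfer levels.
    """
    max_len = max(len(key) for key in IDIOM_PAIRS)
    result = []
    i = 0
    while i < len(tokens):
        for n in range(min(max_len, len(tokens) - i), 0, -1):
            key = tuple(tokens[i:i + n])
            if key in IDIOM_PAIRS:
                result.append(("EN", IDIOM_PAIRS[key]))
                i += n
                break
        else:
            # No idiom starts at this position; keep the token as is.
            result.append(("JA", tokens[i]))
            i += 1
    return result


transferred = transfer_idioms(["watashi", "wa", "hara", "ga", "tatsu"])
```

   Because every word of the pattern is fixed, matching reduces to an
exact lookup, which is why these rules can be tried before any other
level.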
(2) Individual Recognition Structure (a Semantic
   Valentz Pattern Transfer Method)
   More general structures are classified into this category. In
the specific recognition structure, the appearance of the written
words is completely fixed. In contrast, an individual recognition
structure is a structure in which the appearance of a single word
is fixed, and the other words are not fixed but are restricted by
their semantic attributes. When the appearance of a declinable
word is fixed, the contents of the other "bunsetsu" (Japanese
clauses) connected to the declinable word are restricted in their
use of "joshi" and in the semantic attributes of their nouns. A
case grammar (Fillmore 1975) can be applied to these individual
recognition structures. However, a Valentz Pattern (Ishiwata 1983)
Transfer method is more suitable because it does not need a deep
structure, which is difficult to define exactly. In a case
grammar, some of the meaning of a syntactic structure is lost in
the process of deciding deep cases, whereas the meaning of a
syntactic structure can be transferred by a Valentz Pattern
Transfer method.
   This paper uses a Valentz Pattern Transfer method augmented by
restrictions on the attributes of words. This method is supported
by a system of precise and mutually exclusive semantic attributes
of words. It can transfer meanings which cannot be categorized by
case grammar.
   Individual recognition structures are registered in a pattern
dictionary as pairs of Japanese and English expressions. An
English recognition structure also has a related key word used in
the translation, and other restrictions such as prepositions and
the semantic attributes of nouns. The ambiguity in selecting a
translation word therefore decreases. This method requires many
patterns, up to about ten thousand, and a precise system of
semantic attributes. However, the pattern groups, each consisting
of the patterns that share a key word, are independent of each
other. Therefore, the consistency of the rules across groups does
not need to be checked in principle, and the system of transfer
rules can be expanded easily.
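
   The following Python sketch illustrates how a pattern keyed by a
declinable word, with "joshi" slots restricted by semantic
attributes, can select the English translation word. All names and
data here are assumptions made for the example: the attribute table
SEMANTIC_ATTR, the noun glosses NOUN_EN, the two patterns for the
verb "nomu", and the function transfer_valentz are hypothetical and
far smaller than a real pattern dictionary.

```python
# Hypothetical semantic attributes and English glosses for nouns.
SEMANTIC_ATTR = {"kare": "human", "mizu": "liquid", "kusuri": "medicine"}
NOUN_EN = {"kare": "he", "mizu": "water", "kusuri": "medicine"}

# Hypothetical pattern group for one key word (the declinable word).
# Each pattern maps joshi slots to required semantic attributes and is
# paired with a basic (uninflected) English template.
PATTERNS = {
    "nomu": [
        ({"ga": "human", "o": "medicine"}, "{ga} take {o}"),
        ({"ga": "human", "o": "liquid"}, "{ga} drink {o}"),
    ],
}


def transfer_valentz(verb, bunsetsu):
    """Return the basic English clause for a verb and its joshi -> noun map.

    Returns None when no registered pattern fits, so that a more
    general transfer level can take over.
    """
    for slots, template in PATTERNS.get(verb, []):
        same_joshi = slots.keys() == bunsetsu.keys()
        if same_joshi and all(
            SEMANTIC_ATTR.get(bunsetsu[joshi]) == attr
            for joshi, attr in slots.items()
        ):
            glosses = {joshi: NOUN_EN[noun] for joshi, noun in bunsetsu.items()}
            return template.format(**glosses)
    return None


medicine = transfer_valentz("nomu", {"ga": "kare", "o": "kusuri"})
water = transfer_valentz("nomu", {"ga": "kare", "o": "mizu"})
```

   Here the same Japanese verb yields "take" or "drink" depending on
the semantic attribute of the noun marked by "o". Patterns for
another key word would live in a separate entry of PATTERNS and would
never be consulted for "nomu", which mirrors the independence of
pattern groups noted above.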
(3) General Recognition Structure (A General Pattern
   Transfer Method)
   In the previous two structures, a particular word or word
combination was treated as a pattern. Here, more general patterns
are considered. The appearance of a word is not fixed. For
instance, patterns are classified by verb type, such as
instantaneous verbs, verbs of state and so on. General patterns
corresponding to groups of verbs are prepared. Rough translation
cannot be avoided with this method because of its generality.
   For these three methods, the more specific the structure that a
pattern has, the higher the quality of translation that can be
expected. These methods are applied to basic Japanese sentences
(objective expressions) in the order described above. If no
pattern relevant to a given Japanese sentence can be found in the
Idiomatic Expression Pattern dictionary or the Semantic Valentz
Pattern dictionary, then a General Pattern is used, and the quality
of translation decreases. However, as the pair pattern dictionary
grows, the translation quality should improve.
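
   The order of application described above amounts to a cascade
with fallback, which can be sketched in Python as follows. The three
level functions, their one-entry dictionaries, and the placeholder
output of the general level are hypothetical stand-ins chosen for the
example; only the control flow (most specific level first, general
pattern last) is taken from the text.

```python
# Hypothetical one-entry dictionaries standing in for the real ones.
IDIOM_PATTERNS = {"hara ga tatsu": "get angry"}
VALENTZ_PATTERNS = {"kusuri o nomu": "take medicine"}


def idiom_level(sentence):
    """Level 1: all words fixed. Returns None if no pattern fits."""
    return IDIOM_PATTERNS.get(sentence)


def valentz_level(sentence):
    """Level 2: one word fixed, others restricted. None if no fit."""
    return VALENTZ_PATTERNS.get(sentence)


def general_level(sentence):
    """Level 3: always applies, at the cost of a rough translation."""
    return f"[rough] {sentence}"


LEVELS = [
    ("idiomatic", idiom_level),
    ("valentz", valentz_level),
    ("general", general_level),
]


def multi_level_transfer(sentence):
    """Try each level in order and report which one produced the result."""
    for name, rule in LEVELS:
        result = rule(sentence)
        if result is not None:
            return name, result
    raise ValueError("no transfer level applied")


level, text = multi_level_transfer("kusuri o nomu")
```

   Adding entries to the first two dictionaries moves sentences from
the general level to a more specific one without touching the
cascade itself.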
3-3 Construction of the MLMT (Multi-Level Machine
   Translation) Method
   The MLMT method consists of two sub-methods, a Separate and
Recombine method for subjective expressions and a Multi-Level
Transfer method for objective expressions, as shown in Fig. 5.
   This translation process is similar to translation by a human,
shown in Fig. 6. That is, in human translation, a translator first
relives for himself the speaker's experience described by a given
sentence. This process is supported by the Japanese norm that
connects a speaker's recognition to Japanese expressions. Thus, a
translator understands the way the objects should be and the
speaker's emotions and intentions towards them. In the MLMT
method, an original Japanese sentence is separated into a
description of the way the objects should be and the speaker's
emotions. The way the objects should be is represented by a basic
Japanese sentence (an objective expression), and the speaker's
emotions are arranged in a reference table. In human translation,
the way the objects should be is next reorganized in the framework
of English, and the speaker's emotions are recombined with it.
Similarly, in the MLMT method, the meanings of objects are
transferred into English by the three levels of the transfer
method. The speaker's emotions, arranged in the reference table,
are recombined to give the final English expressions.
   This method has the following characteristics. Idiomatic
patterns and Valentz patterns, which represent syntactic structure
and meaning together, can be used not only for Japanese to English
translation but also for Japanese sentence analysis. Therefore,
there should be fewer ambiguities in the analysis than with the
ordinary method. Moreover, transfer rules are highly independent
of each other, so consistency checks are limited to a small range.
The translation system should be easy to expand.
4. Implementation of a Japanese to English Translation System
   The MLMT method has been implemented in the "Automatic Language
Translation System for Japanese to English" (ALT-J/E). This system
first analyzes the morphemes of a given sentence. A morpheme is a
meaningful linguistic unit that does not contain any smaller
meaningful units. In this analysis, the boundaries of words and
the syntactic features of every word are determined. Dependencies
between components of a sentence are subsequently determined. A
"unit sentence" is extracted as a declinable word and its related
parts. A unit sentence has only one declinable word. A "simple
sentence" has a single declinable word at the top of the tree
structure of the sentence. It sometimes has several declinable
words at lower nodes of the tree structure.
   When a simple sentence is separated into several unit
sentences, the relations between the declinable words of the
simple sentence are preserved. After a unit sentence is
extracted, it is handled as a unit of analysis. A unit sentence is
transformed into a basic Japanese sentence by extracting the
subjective information, such as aspect, mode, tense and so on,
represented by the predicate. Patterns are used to analyze a
Japanese sentence. When a simple sentence has several candidate
unit sentences, every candidate is analyzed with patterns. Only
the unit sentences which fit some of the patterns are used. This
process decreases the ambiguity of analysis.
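
   The use of patterns to discard implausible candidates can be
pictured with the Python sketch below. It assumes a much simplified
representation in which a candidate unit sentence is a verb together
with the "joshi" attached to it, and a pattern is the set of "joshi"
the verb accepts; KNOWN_FRAMES and both functions are hypothetical
and ignore the semantic attributes a real pattern would also check.

```python
# Hypothetical patterns: verb -> list of accepted joshi sets.
KNOWN_FRAMES = {
    "nomu": [{"ga", "o"}],
    "iku": [{"ga", "ni"}],
}


def fits_some_pattern(candidate):
    """True if the candidate (verb, joshi tuple) fits a registered pattern."""
    verb, particles = candidate
    return any(set(particles) == frame for frame in KNOWN_FRAMES.get(verb, []))


def filter_candidates(candidates):
    """Keep only the candidate unit sentences that fit some pattern."""
    return [c for c in candidates if fits_some_pattern(c)]


# Three competing analyses; the second attaches "ni" to the wrong verb.
candidates = [
    ("nomu", ("ga", "o")),
    ("nomu", ("ga", "ni")),
    ("iku", ("ga", "ni")),
]
kept = filter_candidates(candidates)
```

   Only the first and third candidates survive, so the same patterns
that later drive the transfer also prune the analysis.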
   After this analysis with patterns, the pattern to be applied to
every unit sentence has been determined, so the corresponding
English pattern has also been determined at the same time. Basic
English expressions can then be obtained easily. The final English
sentence is generated by adding the subjective information kept in
a related table.
5. Conclusion
   A machine translation method called the MLMT (Multi-Level
Machine Translation) method was developed, based on the Process
Construction Theory of natural language.
   Problems of ordinary methods based on transformational
generative grammar were discussed. It was shown that, for machine
translation, it is important that the recognitions of a subject
and an object be separated, and that the meanings connected to a
syntactic structure not be lost. The MLMT method consists of two
sub-methods which correspond to these two ideas: a Separate and
Recombine method for subjective expressions, and a Multi-Level
Transfer method for objective expressions.
   Ideally, to handle a syntactic structure and its meaning as one
unit and so produce high-quality translation, all expressions
should be registered. The characteristics of a natural language
make this technically impractical. The technical compromise can
be summarized as follows: (1) sentence structures are classified
as patterns corresponding to abstraction levels of a speaker's
recognition; (2) subjective expressions are separated from the
original sentences to improve the ratio of fitting patterns.
   The MLMT method was proposed for translating "competent
Japanese sentences" into English. However, this method can also
be applied to other translations, such as English to Japanese or
Japanese to Chinese.
Acknowledgement
   The author wishes to thank Dr. Masahiro Miyazaki and Mr. Satoshi
Shirai for their valuable discussions. He also wishes to thank the
members of their group for implementing this method in the
ALT-J/E system.
References:
Anzai T. (1983). 'Conception in English', Kodan-sha
  (in Japanese)
Apel K.O. (1971). 'Noam Chomskys Sprachtheorie und die
  Philosophie der Gegenwart', (Japanese issue by
  S. Iguch, Taishukan, 1976)
Chomsky N. (1965). 'Aspects of the Theory of Syntax', MIT
  Press, Cambridge, Mass.
Chomsky N. (1966). 'Cartesian Linguistics', (Japanese
  issue by Kawamoto, Misuzu)
Chomsky N. (1968). 'Language and Mind', New York
Chomsky N. (1973). 'Conditions on Transformations',
  Anderson and Kiparsky, pp.232-236
Fages J.B. (1968). 'Comprendre le structuralisme',
  Collection <<Regard>>, Privat (Japanese edition by
  H. Kato, Taishukan, 1972)
Fillmore C.J. (1975). 'Toward a Modern Theory of Case
  and Other Articles', Holt, Rinehart & Winston Inc.,
  New York (Japanese edition by H. Tanaka and
  M. Funakoshi, Sanseido, 1975)
Ishiwata T. (1983). 'Grammar and meaning I',
  Asakura-syoten (in Japanese)
Iwanami (1977). 'Japanese 6 (Grammar I), 7 (Grammar II)'
  (in Japanese)
Katz J.J. (1970). 'The Philosophy of Language', (Japa-
  nese edition by U. Nishiyama, Taishukan)
Kazita M. (1976). 'The Trace of Transformational
  Theory', Taishukan (in Japanese)
Koerner E.F.K. (1973). 'Ferdinand de Saussure',
  Braunschweig: Friedr. Vieweg+Sohn GmbH (Japanese
  edition by K. Yamanaka, Taishukan, 1982)
Lancelot C. and Arnauld A. (1966). 'Grammaire generale
  et raisonnee, les fondements de l'art de parler'
  (Japanese edition by H. Minamikata, Taishukan,
  1972)
Lepschy G.C. (1970). 'A survey of structural
  linguistics', Faber & Faber (Japanese edition by
  S. Sugata, Taishukan, 1975)
Miura T. (1967). 'The theory of Noesis and Linguistics
  Vol.1-3', Keiso-shobo (in Japanese)
Miura T. (ed.) (1981). 'Critique of Modern Linguistics',
  Keiso-shobo (in Japanese)
Miyashita S. (1981). 'Searle's linguistics' (Critique
  of Modern Linguistics, pp.121-135, ed. T. Miura,
  Keiso-shobo, in Japanese)
Morita Y. (1981). 'Conception by Japanese', Koki-sha
  (in Japanese)
Muraki K. (1984). 'Japanese to English Translation
  System PIVOT', Nikkei Electronics, 7-Dec.,
  pp.195-220 (in Japanese)
Nagao M. (1983). 'Language Engineering', Shokodo (in
  Japanese)
Schank R.C. (1975). 'Conceptual Information Processing',
  North-Holland
Tokieda M. (1941). 'Kokugogaku Genron (Principles of
  linguistics)', Iwanami (in Japanese)
Uchida H. (1984). 'Japanese to English Translation
  System ATLAS II', Nikkei Electronics, 17-Dec.
Wiseman F. (1965). 'The Principles of Linguistic
  Philosophy', Macmillan and Co., Ltd., (Japanese
  issue by J. Kusunose, Taishukan)