An Approach to Machine Translation Method
based on Constructive Process Theory        
 
 
 
 
 
        SATORU IKEHARA
       MASAHIRO MIYAZAKI
        SATOSHI SHIRAI
         AKIO YOKOO
          
ABSTRACT
 
   The multi-level translation method for machine translation is proposed, based on the Constructive Process Theory. It is composed of two sub-methods: a separate-recombine method and a multi-level transfer method, which are based on an analysis of the speaker's recognition of both the subject and the object, and a simultaneous analysis of the syntax and meaning.
   The separate-recombine method analyzes subjective expressions to extract emotions and intentions, and then recombines them in the target language. The multi-level transfer method conveys the remaining objective expressions in the target language by three levels of transfer, based on the degree of sentence structure abstractness.
   Here, the meaning carried by an expression's structure is taken into account in order to achieve more accurate machine translation than conventional methods.
1. INTRODUCTION
   Unlike natural science, which deals with physical phenomena, research on natural language deals with a mental product. Many explanations for language have been proposed based on different interpretations of the mental function. For instance, Saussure's structuralism (1)(2) assumes that the mental function exists in an "a priori conceptual substance", and states that the content of the mental function is a linguistic norm.
   Neither Saussure's structuralism nor ordinary grammar (3) can explain homographic expressions (why the same sentence structure often has different meanings). Chomsky (4)(5) proposed that a more abstract structure must be considered to be the meaning of an expression. He introduced a "deep structure", which assumes a common thought pattern for everyone. Chomsky's idea focuses on the content of an expression, but he separated the content from the object so that the reflection theory was ignored and a dualism (6)(9) was asserted, in which form opposed content. However, the state of the objects and how recognition actually occurs are reflected in the form of expression, so that the form and the content are interdependent. In addition, the speaker's recognition is connected to an expression. Therefore, a deep structure need not be assumed as a semantic structure separated from a surface structure. On the contrary, the relation between the object, the speaker's recognition of it and the expression should be considered as the true meaning.
   Most current research on machine translation (8)(9) follows some form of transformational generative grammar, in which a common structure of meaning is assumed and represented in an intermediate language independent of ordinary languages. On the contrary, if the "process construction of a language" (10)(11), that is, an object, its recognition and an expression, is considered, it becomes apparent that only the object is common to the languages and that the manner of recognition differs from person to person as well as from language to language. Therefore, it is difficult to assume a deep structure as having a meaning common to different languages. This difference in recognition structure (12)(13) between the original language and the target language should be considered in high quality machine translation.
2. PROCESS CONSTRUCTION OF A LANGUAGE
2-1 The Process Construction Theory
   Language is metaphysically explained as a complex body of the complete world by formalism and structuralism. An expression is explained by the functions or forms of each of its parts.
   Chomsky's explanation pays sufficient attention to the content of expressions. However, the relation between the object and the speaker's recognition is not considered. Prior to this, the Process Construction Theory, proposed by Tokieda (10), claimed that language is composed of three processes: object, recognition and expression. These processes are combined by the law of causality. The state of an object is reflected in the speaker's recognition, and the way the speaker recognizes that object results in an expression. This mental link from object to recognition and finally to expression illustrates the Process Construction Theory. The differences between Chomsky's and Tokieda's theories are shown in Fig. 1.
2-2 Speaker Recognition and Representation
(1) Recognition of the Subject and the Object
   The world recognized by a speaker is composed of a subject, namely the speaker himself, and objects (Fig. 2). A speaker recognizes both his own state and the state of other objects, and connects them to expressions. Based on this, Tokieda explained that a Japanese sentence is composed of subjective expressions and objective expressions. By his definition, subjective expressions directly represent subjective emotions and intentions. Japanese adverbs and "joshi" (post-positional words functioning as auxiliaries to main words) are used for these expressions.
   Objective expressions represent conceptualized objects. Nouns, verbs and adjectives are used for these expressions. When a speaker conceptualizes the subject (the speaker himself), the subject can be represented by these objective expressions. The relationship between subjective and objective expressions has been pointed out by Port-Royal (15) for Indo-European languages.
(2) Connection between a Recognition and an Expression
   An object can be broken down into substance, attribute and relation, all of which have many structures. These structures are reflected in expressions through the speaker's recognition. The substance of an object has, for example, a hierarchical relation between a part and the whole. Attributes are related to the substance. Relations are mainly constructed of three further kinds of relations between substances, attributes and relations.
   These components and partial structures are combined in many ways to create a total structure in the speaker's mind. The manner of recognition depends on the viewpoint of the speaker. Every language has its own framework for representing such recognition. Thus, the original structure of the object is not reflected directly in the expression, but through the speaker's recognition.
   The difference between languages is the difference between the frameworks of the speaker's recognition. The relationship between subjective expressions and objective expressions in the original language should be reconstructed and expressed in the target language.
2-3 Definition of Meaning
(1) A New Definition
   Many explanations of the meaning of an expression have been proposed. Saussure (1) distinguished "langue" as having common social significance and "parole" as having individual meaning based on individual experiences. Taking a different point of view, Chomsky made a distinction between semantic deep structure and syntactic structure. Although their perspectives are different, they agree that content is independent of the object, and that "langue" and deep structure are both common to human beings or specific human groups. A lot of current machine translation research is based on these ideas and assumes that deep structure is meaning. There are also cases where deep structure is defined as the state of an object, which differs from the speaker's recognition of it.
   On the contrary, we define the meaning of an expression as the relation between the object, the recognition and the expression. This definition is based on the fact that how the subject and an object actually exist is connected to an expression through the speaker's recognition. Accordingly, the meaning (i.e. relationship) cannot exist without the expression. The ordinary meaning of a word as defined in a dictionary is not actually a meaning, but rather a language norm. Only when a word is used in an expression does the relation to a recognition and an object arise, giving the word a true meaning.
(2) The Meaning of Syntactic Structure
   A speaker consolidates his recognition and represents it by an expression, using rules to form words, phrases and clauses. This consolidation is based upon language norms supported by a meaning. That is, the state of the object is reflected in a speaker's recognition, and the speaker's recognition is reflected in an expression. This means that the syntactic structure is integral with the object and the recognition, and that the syntactic structure is part of the meaning. Unlike transformational generative grammar, where structure (a surface structure) is distinguished from a meaning (a deep structure), a surface structure can be considered part of the meaning. Therefore, transforming an expression, strictly speaking, changes the meaning. Transformation cannot leave the meaning unchanged. Transformations in ordinary language processing are transformations to alternate approximate expressions.
   Thus, the element composition method, which tries to compose the whole meaning from its parts, neglects the meaning of the syntactic structure. The meaning of a part of an expression can be correctly interpreted only in the context of whole sentences. Accordingly, the rules for the meaning of a word used in the expression can be determined only for that particular context.
3.  MULTI-LEVEL MACHINE TRANSLATION METHOD
   A competent sentence is defined as a sentence that can be translated in isolation using only linguistic knowledge, namely, grammar and a dictionary. The multi-level machine translation method (MLMT) is proposed for competent sentences.
3-1 Separate and Recombine Method
   Japanese is classified as an agglutinative language where "joshi" and adverbs are used for subjective expressions. By contrast, English is an inflectional language, with subjective expressions usually represented by inflections. Thus, Japanese subjective expressions do not directly correspond to English ones, and it is difficult to translate word for word.
   For this reason, the speaker's emotions and intentions are first classified into categories, and the subjective part of a given Japanese sentence is then analyzed to determine which categories it represents. Thus, the original Japanese sentences are transformed into basic Japanese sentences. The Japanese sentences remaining after the subjective expressions are extracted are objective expressions.
   These expressions are translated into basic English sentences by the multi-level transfer method described in the next section. Finally, the speaker's previously extracted emotions and intentions are recombined with the basic English sentences. Adverbs and prepositions are added and nouns and verbs inflected. In this way, information about the subjective expressions separated from the original Japanese sentences is recombined in creating the English sentences.
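   The flow of separation, transfer and recombination can be sketched in a few lines of Python. This is only a toy illustration under stated assumptions: the romanized marker table, the category names, the English renderings and the one-entry transfer function are all hypothetical and are not taken from the actual system.

```python
# Toy sketch of the separate and recombine method. The marker table, the
# category names and the English renderings are hypothetical illustrations.

# Romanized subjective markers mapped to categories of emotion or intention.
SUBJECTIVE_MARKERS = {
    "darou": "conjecture",
    "ka": "question",
}

# How each category is recombined with a basic English sentence.
RECOMBINE_RULES = {
    "conjecture": lambda s: "probably, " + s,
    "question": lambda s: "is it true that " + s + "?",
}


def separate(tokens):
    """Return (basic sentence tokens, reference table of categories)."""
    basic, reference_table = [], []
    for token in tokens:
        category = SUBJECTIVE_MARKERS.get(token)
        if category is None:
            basic.append(token)               # objective expression
        else:
            reference_table.append(category)  # extracted subjective part
    return basic, reference_table


def recombine(basic_english, reference_table):
    """Apply the stored categories to the basic English sentence."""
    sentence = basic_english
    for category in reference_table:
        sentence = RECOMBINE_RULES[category](sentence)
    return sentence


def translate(tokens, transfer):
    """Separate, transfer the objective part, then recombine."""
    basic, reference_table = separate(tokens)
    return recombine(transfer(basic), reference_table)


def toy_transfer(basic_tokens):
    """Stand-in for the multi-level transfer method: a one-entry lookup."""
    return {("ame", "ga", "furu"): "it rains"}[tuple(basic_tokens)]


print(translate(["ame", "ga", "furu", "darou"], toy_transfer))
```

   In this sketch the reference table is a plain list of category names. The point of the design is that the transfer step sees only the basic sentence, so the number of patterns it must cover does not grow with the variety of subjective expressions.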
3-2 The Multi-Level Transfer Method
   The state of the object is represented in a basic Japanese sentence (an objective expression) from the speaker's viewpoint. The speaker's recognition of an object has several structures and these structures are reflected in the structure of the objective expressions. If we strictly adhere to the view that changing an expression changes the meaning and that integrated processing of syntactic structure and meaning is needed for accurate translation, then a matching English expression is needed for every possible Japanese expression. Clearly, this is not practical because of the infinite number of expressions. Thus, sentence structures are classified into three levels, according to the strength of the link between the sentence structure and the meaning. Sentence structures are transformed by a method suitable to the level.
(1) Specific Recognition Structure Level
   Idiomatic expressions have meanings which cannot be determined from individual words, causing extreme difficulty in translation by an element composition method. These kinds of expressions are therefore transferred whole, as matched pairs of Japanese and English expressions, by idiomatic expression transfer rules.
(2) Individual Recognition Structure Level
   More general structures are classified into this category. In specific recognition structures, the words are completely fixed. However, in an individual recognition structure, a single word is fixed, and the other words are restricted by their semantic attributes. When a declinable word is fixed, the contents of other "bunsetsu" (Japanese clauses) connected to the declinable word are restricted in the use of "joshi" and the semantic attributes of nouns.
   A case grammar (16) could be applied for these structures. However, a Valentz pattern (17) transfer method is more suitable as it does not require use of a deep structure, which is difficult to define precisely. In a case grammar, some of the meaning of a syntactic structure is missed in the process of deep case selection, whereas the meaning of a syntactic structure can be transferred intact using the Valentz pattern transfer method.
   This paper uses the Valentz pattern transfer method augmented by restrictions on word attributes. This method is supported by a system of precise and mutually exclusive semantic attributes of words. It can transfer meanings which cannot be categorized by case grammar.
   Individual recognition structures are paired for corresponding Japanese and English expressions and registered in a pattern dictionary, similarly to idiomatic transfer rules.
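   A pattern dictionary of this kind might be sketched as follows. The sketch assumes a hypothetical representation in which each pattern fixes one verb, constrains each attached "bunsetsu" by its "joshi" and by the semantic attribute of its noun, and carries an English frame. The noun lexicon, attribute names and patterns are invented for illustration only.

```python
from dataclasses import dataclass

# Hypothetical noun lexicon: romanized noun -> (semantic attribute, English).
NOUNS = {
    "kare": ("human", "he"),
    "kaze": ("illness", "a cold"),
    "tsuna": ("concrete", "a rope"),
}


@dataclass
class Pattern:
    verb: str      # the single fixed (declinable) word
    slots: dict    # joshi -> semantic attribute required of the noun
    template: str  # English frame; placeholders are named by joshi


# Two patterns share the verb "hiku"; the attribute of the "wo" noun decides.
PATTERNS = [
    Pattern("hiku", {"ga": "human", "wo": "illness"}, "{ga} catches {wo}"),
    Pattern("hiku", {"ga": "human", "wo": "concrete"}, "{ga} pulls {wo}"),
]


def transfer_by_pattern(verb, bunsetsu):
    """bunsetsu is a list of (noun, joshi) pairs attached to the verb.

    Returns the English expression of the first matching pattern, or None.
    """
    for pattern in PATTERNS:
        if pattern.verb != verb:
            continue
        if {joshi for _, joshi in bunsetsu} != set(pattern.slots):
            continue
        fillers = {}
        for noun, joshi in bunsetsu:
            attribute, english = NOUNS.get(noun, (None, None))
            if attribute != pattern.slots[joshi]:
                break  # semantic restriction violated, try the next pattern
            fillers[joshi] = english
        else:
            return pattern.template.format(**fillers)
    return None


print(transfer_by_pattern("hiku", [("kare", "ga"), ("kaze", "wo")]))
print(transfer_by_pattern("hiku", [("kare", "ga"), ("tsuna", "wo")]))
```

   Because the whole pattern is matched and transferred as a unit, no intermediate deep case has to be chosen. The same surface frame yields different English verbs purely through the semantic attribute restriction.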
(3) General Recognition Structure  Level
   Compared with both specific and individual recognition structures, which are thought to consist of a pattern of specific words or of specific and associated words, the general recognition structure deals with more comprehensive patterns. At this level, no word is fixed. For example, patterns may be classified by verb type, i.e. instantaneous, stative, etc. General patterns corresponding to groups of verbs are prepared. With this method rough translation is inevitable due to generalization.
   Specificity in a structure correlates with the quality of translation. The rule for the three methods is that the more specific the structure, the higher the quality of translation that can be expected. These methods are applied to basic Japanese sentences (objective expressions) in the order described above. If no pattern relevant to a given Japanese sentence can be found in the dictionary of idiomatic expressions or of semantic Valentz patterns, then a general pattern is used, at the cost of the high translation quality. As the pair pattern dictionaries are expanded, the translation quality is expected to improve.
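   The ordering of the three levels amounts to a cascade that tries the most specific rules first and falls back to more general ones. A minimal sketch follows, in which each level is a hypothetical stand-in function that returns None when it has no matching pattern. The tables and the crude general rule are assumptions made for illustration only.

```python
# Level 1 stand-in: whole idiomatic expressions as matched pairs.
IDIOMS = {("hara", "ga", "tatsu"): "get angry"}

# Word glosses used only by the rough general-level fallback.
GLOSS = {"ame": "rain", "furu": "fall", "hara": "belly", "tatsu": "stand"}


def specific_level(tokens):
    """Every word fixed: look the whole expression up."""
    return IDIOMS.get(tuple(tokens))


def individual_level(tokens):
    """One word fixed: here only the verb 'hiku' with the noun 'kaze'."""
    if tokens and tokens[-1] == "hiku" and "kaze" in tokens:
        return "catch a cold"
    return None


def general_level(tokens):
    """No word fixed: a rough rule for the shape 'N ga V'."""
    if len(tokens) == 3 and tokens[1] == "ga":
        noun, verb = GLOSS.get(tokens[0]), GLOSS.get(tokens[2])
        if noun and verb:
            return f"{noun} {verb}s"
    return None


LEVELS = [
    ("specific", specific_level),
    ("individual", individual_level),
    ("general", general_level),
]


def translate_basic(tokens):
    """Return (level name, English) from the first level that matches."""
    for name, transfer in LEVELS:
        result = transfer(tokens)
        if result is not None:
            return name, result
    return None, None


print(translate_basic(["hara", "ga", "tatsu"]))
print(translate_basic(["ame", "ga", "furu"]))
```

   The idiom shows why the order matters. The general rule alone would render "hara ga tatsu" as "belly stands", whereas the specific level intercepts it first. Adding an entry to a more specific table improves the output for that sentence without touching the other levels.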
3-3 Structure of the MLMT Method
   The MLMT method consists of two sub-methods: a separate and recombine method for subjective expressions, and a multi-level transfer method for objective expressions (Fig. 3).
   This translation process is similar to manual translation (Fig. 4). Here, a human translator first feels for himself the speaker's experience as described by a given sentence. This process is supported by the Japanese norm that connects a speaker's recognition to Japanese expressions. Thus, the translator understands the state of the objects and the speaker's emotions and intentions towards the objects. In the MLMT method, the original Japanese sentence is separated into descriptions of the state of the objects and the speaker's emotions. The state of the objects is represented by a basic Japanese sentence (an objective expression) and the speaker's emotions are arranged in a reference table.
   In human translation, the state of the objects is then reorganized in the framework of English, and the speaker's emotions are recombined with it. Similarly, in the MLMT method, the meanings of objects are transferred into English by the three levels of the transfer method. The speaker's emotions, arranged in a reference table, are recombined to give the final English expressions.
   In this method, syntactic structure and meaning are represented by idiomatic patterns or Valentz patterns. They can be used not only for Japanese-to-English translation, but also for Japanese sentence analysis, which results in fewer ambiguities than the ordinary method. Moreover, transfer rules are highly independent of each other; therefore, the consistency check is limited to a smaller range, facilitating the expansion of the translation system.
4.  CONCLUSION
   Based on the constructive process theory of natural language, the multi-level machine translation method was proposed.
   For machine translation, the importance of separating recognitions concerning subject and object, and of retaining the meaning associated with syntactic structure, was shown. The MLMT method consists of two sub-methods, which correspond to these two ideas: a separate and recombine method for subjective expressions, and a multi-level transfer method for objective expressions.
   Ideally, to handle a syntactic structure and its meaning as one unit and thus to produce high quality translation, all possible expressions should be identified and included in the transfer rules. The open-ended characteristics of natural language make this technically impractical. As a technical compromise, expression structures are classified into patterns corresponding to abstraction levels of speaker recognition, and subjective expressions are separated from the original sentences to improve the ratio of matching patterns.
   The MLMT method was proposed for translating competent Japanese sentences into English. Since the proposed ideas about separating subjective and objective expressions, and about the importance of the meaning of syntactic structure, can be applied to natural languages in general, the MLMT method should also be applicable to other natural languages.
 
Acknowledgment
   The authors wish to thank the other members of our research group for helping to implement this method.
 
References:
(1) E. F. Koerner: Ferdinand de Saussure, Braunschweig: Friedr. Vieweg + Sohn GmbH, 1973 (Japanese edition by K. Yamanaka, Taishukan, 1982)
(2) G. C. Lepschy: A Survey of Structural Linguistics, Faber & Faber, 1970 (Japanese edition by S. Sugata, Taishukan, 1975)
(3) Iwanami: Japanese 6 (Grammar I), 7 (Grammar II), 1977 (In Japanese)
(4) K. O. Apel: Noam Chomskys Sprachtheorie und die Philosophie der Gegenwart, 1971 (Japanese edition by S. Iguchi, Taishukan, 1976)
(5) M. Kazita: The Trace of Transformational Theory (In Japanese), Taishukan, 1976
(6) N. Chomsky: Cartesian Linguistics (Japanese edition by Kawamoto, Misuzu, 1966)
(7) N. Chomsky: Language and Mind, New York, 1968
(8) H. Uchida: Japanese to English Translation System ATLAS II, Nikkei Electronics, 17-Dec., 1984
(9) K. Muraki: Japanese to English Translation System PIVOT (In Japanese), Nikkei Electronics, 7-Dec., pp. 195-220, 1984
(10) M. Tokieda: Kokugogaku Genron (Principles of Linguistics) (In Japanese), Iwanami, 1941
(11) T. Miura: The Theory of Noesis and Linguistics (In Japanese), Vol. 1-3, Keiso-shobo, 1967
(12) Y. Morita: Conception by Japanese (In Japanese), Koki-sha, 1981
(13) T. Miura (ed.): Critique of Modern Linguistics (In Japanese), Keiso-shobo, 1981
(14) T. Anzai: Conception in English (In Japanese), Kodan-sha, 1983
(15) C. Lancelot and A. Arnauld: Grammaire generale et raisonnee, les fondements de l'art de parler, 1966 (Japanese edition by H. Minamikata, Taishukan, 1972)
(16) C. J. Fillmore: Toward a Modern Theory of Case and Other Articles, Holt, Rinehart & Winston Inc., New York, 1975 (Japanese edition by H. Tanaka and M. Funakoshi, Sanseido, 1975)
(17) T. Ishiwata: Grammar and Meaning (In Japanese), I, Asakura-shoten, 1983
SATORU IKEHARA
Senior Research Engineer, Supervisor in the NTT Communications and Information Processing Laboratories. Since joining the ECL system in 1969, he has developed a formal algebraic manipulation language, queuing network analysis theory, and a natural language processing system. He is presently developing a machine translation system. He received bachelor's, master's and Dr. Eng. degrees from Osaka University in 1967, 1969 and 1983, respectively. He was awarded the dissertation prize in 1982 by the Information Processing Society for his research on queuing network analysis. He is a member of the Institute of Electronics, Information and Communication Engineers of Japan and the Information Processing Society of Japan.
MASAHIRO MIYAZAKI
Senior Research Engineer in the NTT Communications and Information Processing Laboratories. Since joining the ECL system in 1969, he has developed the computer system DIPS-11, performance evaluation theory for computer systems and Japanese text-to-speech systems. He is presently developing a machine translation system. He received a bachelor's degree in 1969 and a Dr. Eng. degree from Tokyo Institute of Technology in 1986. He is a member of the Institute of Electronics, Information and Communication Engineers of Japan and the Information Processing Society of Japan.
SATOSHI SHIRAI
Senior Research Engineer in the NTT Communications and Information Processing Laboratories. Since joining the ECL system in 1980, he has developed Japanese analysis systems for natural language processing systems. He is presently developing a machine translation system. He received bachelor's and master's degrees from Osaka University in 1978 and 1980. He is a member of the Institute of Electronics, Information and Communication Engineers of Japan and the Information Processing Society of Japan.
AKIO YOKOO
Research Engineer of the NTT Natural Language Processing Laboratory in the NTT Communications and Information Processing Laboratories. Since joining the ECL system in 1982, he has developed a frame representation language and a natural language processing system. He is presently developing a machine translation system. He received a bachelor's degree in 1980 and a master's degree in 1982 from the University of Electro-Communications. He is a member of the Institute of Electronics, Information and Communication Engineers of Japan, the Information Processing Society of Japan, and the Japanese Society for Artificial Intelligence.