An Approach to Machine Translation Method based on Constructive Process Theory

SATORU IKEHARA

MASAHIRO MIYAZAKI

SATOSHI SHIRAI

AKIO YOKOO

ABSTRACT

The multi-level translation method for machine translation is proposed based on the Constructive Process Theory. It is composed of two sub-methods: a separate-recombine method and a multi-level transfer method, which are based on an analysis of the speaker's recognition of both the subject and the object, and a simultaneous analysis of the syntax and meaning.

The separate-recombine method analyzes subjective expressions to extract emotions and intentions, and then recombines them in the target language. The multi-level transfer method conveys the remaining objective expressions in the target language by three levels of transfer, based on the degree of sentence structure abstractness.

Here, the meaning carried by an expression's structure is taken into account in order to achieve higher accuracy in machine translation than the conventional method.

1. INTRODUCTION

Unlike natural science, which deals with physical phenomena, research on natural language deals with a mental product. Many explanations for language have been proposed based on different interpretations of the mental function. For instance, Saussure's structuralism (1)(2) assumes that the mental function exists in an "a priori conceptual substance", and states that the content of the mental function is a linguistic norm.

Neither Saussure's structuralism nor ordinary grammar (3) can explain homographic expressions (why the same sentence structure often has different meanings). Chomsky (4)(5) proposed that a more abstract structure must be considered to be the meaning of an expression. He introduced a "deep structure", which assumes a common thought pattern for everyone. Chomsky's idea focuses on the content of an expression, but he separated the content from the object, so that the reflection theory was ignored and a dualism (6)(9) was asserted, in which form opposed content. However, the state of the objects and how recognition actually occurs are reflected by the form of expression, so that the form and the content are interdependent. In addition, the speaker's recognition is connected to an expression. Therefore, a deep structure need not be assumed as a semantic structure separated from a surface structure. On the contrary, the relation between the object, the speaker's recognition of it and the expression should be considered as the true meaning.

Most current research on machine translation (8)(9) follows some form of transformational generative grammar, in which a common structure of meaning is assumed and represented in an intermediate language independent of ordinary languages. By contrast, if the "process construction of a language" (10)(11), consisting of an object, a recognition and an expression, is considered, it becomes apparent that only the object is common to the languages and that the manner of recognition differs from person to person as well as from language to language. Therefore, it is difficult to assume a deep structure as having a meaning common to different languages. This difference in recognition structure (12)(13) between the original language and the target language should be considered in high quality machine translation.

2. PROCESS CONSTRUCTION OF A LANGUAGE

2-1 The Process Construction Theory

Language is metaphysically explained as a complex body of the complete world by formalism and structuralism. An expression is explained by the functions or forms of each of its parts.

Chomsky's explanation pays sufficient attention to the content of expressions. However, the relation between the object and the speaker's recognition is not considered. Prior to this, the Process Construction Theory proposed by Tokieda (10) claimed that language is composed of three processes: object, recognition and expression. These processes are combined by the law of causality. The state of an object is reflected in the speaker's recognition, and the way the speaker recognizes that object results in an expression. This mental link from object to recognition and finally to expression illustrates the Process Construction Theory. The differences between Chomsky's and Tokieda's theories are shown in Fig. 1.

2-2 Speaker Recognition and Representation

(1) Recognition of the Subject and the Object

The world recognized by a speaker is composed of a subject, namely the speaker himself, and objects (Fig. 2). A speaker recognizes both his own state and the state of other objects, and connects them to expressions. Based on this, Tokieda explained that a Japanese sentence is composed of subjective expressions and objective expressions. By his definition, subjective expressions directly represent subjective emotions and intentions. Japanese adverbs and "joshi" (post-positional words functioning as auxiliaries to main words) are used for these expressions.

Objective expressions represent conceptualized objects. Nouns, verbs and adjectives are used for these expressions. When a speaker conceptualizes the subject (the speaker himself), the subject can be represented by these objective expressions. The relationship between subjective and objective expressions has been pointed out in the Port-Royal grammar (15) for Indo-European languages.

(2) Connection between a Recognition and an Expression

An object can be broken down into substance, attribute and relation, all of which have many structures. These structures are reflected in expressions through the speaker's recognition. The substance of an object has, for example, a hierarchical relation between a part and the whole. Attributes are related to the substance. Relations are mainly of three further kinds: relations between substances, between attributes, and between relations themselves.

These components and partial structures are combined in many ways to create a total structure in the speaker's mind. The manner of recognition depends on the viewpoint of the speaker. Every language has its own framework for representing such recognition. Thus, the original structure of the object is not reflected directly in the expression, but through the speaker's recognition.

The difference between languages is the difference between the frameworks of the speaker's recognition. The relationship between subjective expressions and objective expressions in the original language should be reconstructed and expressed in the target language.

2-3 Definition of Meaning

(1) A New Definition

Many explanations of the meaning of an expression have been proposed. Saussure (1) distinguished "langue" as having common social significance and "parole" as having individual meaning based on individual experiences. Taking a different point of view, Chomsky made a distinction between semantic deep structure and syntactic structure. Although their perspectives are different, they agree that content is independent of the object, and that "langue" and deep structure are both common to human beings or specific human groups. Much current machine translation research is based on these ideas and assumes that deep structure is meaning. There are also cases where deep structure is defined as the state of an object, which differs from the speaker's recognition of it.

By contrast, we define the meaning of an expression as the relation between the object, the recognition and the expression. This definition is based on the fact that how the subject and an object actually exist is connected to an expression through the speaker's recognition. Accordingly, the meaning (i.e. the relationship) cannot exist without the expression. The ordinary meaning of a word as defined in a dictionary is not actually a meaning, but rather a language norm. Only when a word is used in an expression does the relation to a recognition and an object arise, giving the word a true meaning.

(2) The Meaning of Syntactic Structure

A speaker consolidates his recognition and represents it by an expression, using rules to form words, phrases and clauses. This consolidation is based upon language norms supported by a meaning. That is, the state of the object is reflected in a speaker's recognition, and the speaker's recognition is reflected in an expression. This means that the syntactic structure is integral with the object and the recognition, and that the syntactic structure is part of the meaning. Unlike transformational generative grammar, where structure (a surface structure) is distinguished from a meaning (a deep structure), a surface structure can be considered part of the meaning. Therefore, transforming an expression, strictly speaking, changes the meaning. Transformation cannot leave the meaning unchanged. Transformations in ordinary language processing are transformations to alternate approximate expressions.

Thus, the element composition method, which tries to compose the whole meaning from its parts, neglects the meaning of the syntactic structure. The meaning of a part of an expression can be correctly interpreted only in the context of whole sentences. Accordingly, the rules for the meaning of a word used in the expression can be determined only for that particular context.

3. MULTI-LEVEL MACHINE TRANSLATION METHOD

A competent sentence is defined as a sentence that can be translated in isolation using only linguistic knowledge, namely, grammar and a dictionary. The multi-level machine translation method (MLMT) is proposed for competent sentences.

3-1 Separate and Recombine Method

Japanese is classified as an agglutinative language where "joshi" and adverbs are used for subjective expressions. By contrast, English is an inflectional language, with subjective expressions usually represented by inflections. Thus, Japanese subjective expressions do not directly correspond to English ones, and it is difficult to translate word for word.

For this reason, the speaker's emotions and intentions are first classified into categories, and the subjective part of a given Japanese sentence is then analyzed to determine which categories it represents. Thus, the original Japanese sentences are transformed into basic Japanese sentences. The Japanese sentences remaining after the subjective expressions are extracted are objective expressions.

These expressions are translated into basic English sentences by the multi-level transfer method described in the next section. Finally, the speaker's previously extracted emotions and intentions are recombined with the basic English sentences. Adverbs and prepositions are added and nouns and verbs inflected. In this way, information about the subjective expressions separated from the original Japanese sentences is recombined in creating the English sentences.
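The separate-and-recombine flow can be illustrated with a small sketch. This is only a toy under stated assumptions, not the authors' implementation: the romanized tokens, the marker table, the category names (`CONJECTURE`, `QUESTION`), and the stub `translate_objective` function are all hypothetical, and which words count as subjective markers is simplified for the example.

```python
from dataclasses import dataclass, field

# Hypothetical marker table (romanized Japanese -> category of emotion or
# intention). The paper does not list its categories; these are placeholders.
SUBJECTIVE_MARKERS = {
    "darou": "CONJECTURE",
    "ka": "QUESTION",
}


@dataclass
class Separated:
    basic_tokens: list                               # objective expression
    categories: list = field(default_factory=list)   # extracted categories


def separate(tokens):
    """Split a sentence into a basic sentence and subjective categories."""
    result = Separated(basic_tokens=[])
    for token in tokens:
        category = SUBJECTIVE_MARKERS.get(token)
        if category is None:
            result.basic_tokens.append(token)
        else:
            result.categories.append(category)
    return result


def translate_objective(basic_tokens):
    """Stand-in for the multi-level transfer step (a fixed toy lookup)."""
    toy_transfer = {("kare", "wa", "kuru"): "he comes"}
    return toy_transfer[tuple(basic_tokens)]


def recombine(basic_english, categories):
    """Reattach the extracted categories to the basic English sentence."""
    text = basic_english
    if "CONJECTURE" in categories:
        text = "probably " + text          # add an adverb for the category
    ending = "?" if "QUESTION" in categories else "."
    return text[0].upper() + text[1:] + ending


separated = separate(["kare", "wa", "kuru", "darou"])
english = recombine(translate_objective(separated.basic_tokens),
                    separated.categories)
print(english)
```

The point of the sketch is the ordering: the subjective categories are set aside before transfer, so the transfer step only ever sees the basic sentence, and they are reattached afterwards on the English side.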

3-2 The Multi-Level Transfer Method

The state of the object is represented in a basic Japanese sentence (an objective expression) from the speaker's viewpoint. The speaker's recognition of an object has several structures and these structures are reflected in the structure of the objective expressions. If we strictly adhere to the view that changing an expression changes the meaning, and that an integrated treatment of syntactic structure and meaning is needed for accurate translation, then a matching English expression is needed for every possible Japanese expression. Clearly, this is not practical because of the infinite number of expressions. Thus, sentence structures are classified into three levels, according to the strength of the link between the sentence structure and the meaning. Sentence structures are transformed by a method suitable to the level.

(1) Specific Recognition Structure Level

Idiomatic expressions have meanings which cannot be determined from individual words, causing extreme difficulty in translation by an element composition method. These kinds of expressions are therefore transferred whole as matched pairs of Japanese and English expressions by idiomatic expression transfer rules.

(2) Individual Recognition Structure Level

More general structures are classified into this category. In specific recognition structures, the words are completely fixed. However, in an individual recognition structure, a single word is fixed, and the other words are restricted by their semantic attributes. When a declinable word is fixed, the contents of other "bunsetsu" (Japanese clauses) connected to the declinable word are restricted in the use of "joshi" and the semantic attributes of nouns.

A case grammar (16) could be applied for these structures. However, a valency pattern (17) transfer method is more suitable as it does not require use of a deep structure, which is difficult to define precisely. In a case grammar, some of the meaning of a syntactic structure is missed in the process of deep case selection, whereas the meaning of a syntactic structure can be transferred intact using the valency pattern transfer method.

This paper uses the valency pattern transfer method augmented by restrictions on word attributes. This method is supported by a system of precise and mutually exclusive semantic attributes of words. It can transfer meanings which cannot be categorized by case grammar.

Individual recognition structures are paired for corresponding Japanese and English expressions and registered in a pattern dictionary, similarly to idiomatic transfer rules.
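A pattern of this kind, with one fixed declinable word and the attached "bunsetsu" restricted by "joshi" and noun semantic attributes, might be represented roughly as below. This is a minimal sketch under assumptions: the attribute hierarchy, the noun lexicon, the class `ValencyPattern` and the single `EAT` pattern are all hypothetical examples, not entries from the authors' pattern dictionary.

```python
from dataclasses import dataclass

# Hypothetical semantic attribute hierarchy (child -> parent).
ATTRIBUTE_PARENT = {"human": "animate", "animal": "animate", "food": "thing"}

# Hypothetical noun lexicon: noun -> (semantic attribute, English form).
NOUNS = {
    "inu": ("animal", "the dog"),
    "kare": ("human", "he"),
    "pan": ("food", "bread"),
}


def has_attribute(attribute, required):
    """True if `attribute` equals `required` or descends from it."""
    while attribute is not None:
        if attribute == required:
            return True
        attribute = ATTRIBUTE_PARENT.get(attribute)
    return False


@dataclass
class ValencyPattern:
    verb: str       # the single fixed (declinable) word
    slots: dict     # joshi -> (variable, required semantic attribute)
    english: str    # English template using the variables

    def transfer(self, verb, bunsetsu):
        """Return the English string, or None if the pattern does not match.

        `bunsetsu` is a list of (noun, joshi) pairs attached to the verb.
        """
        if verb != self.verb or len(bunsetsu) != len(self.slots):
            return None
        bindings = {}
        for noun, joshi in bunsetsu:
            if joshi not in self.slots or noun not in NOUNS:
                return None
            variable, required = self.slots[joshi]
            attribute, english_noun = NOUNS[noun]
            if not has_attribute(attribute, required):
                return None
            bindings[variable] = english_noun
        if len(bindings) != len(self.slots):
            return None   # the same joshi appeared twice
        return self.english.format(**bindings)


EAT = ValencyPattern(
    verb="taberu",
    slots={"ga": ("N1", "animate"), "o": ("N2", "food")},
    english="{N1} eats {N2}",
)

print(EAT.transfer("taberu", [("inu", "ga"), ("pan", "o")]))
print(EAT.transfer("taberu", [("pan", "ga"), ("inu", "o")]))
```

The second call fails to match because "pan" does not satisfy the `animate` restriction on the "ga" slot, which is how semantic attributes narrow the set of sentences a pattern accepts without any deep case assignment.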

(3) General Recognition Structure Level

Compared with both specific and individual recognition structures, which are thought to consist of a pattern of specific words, or of a specific word and associated words, the general recognition structure deals with more comprehensive patterns. At this level, no word is fixed. For example, patterns may be classified by verb type, e.g. instantaneous or stative. General patterns corresponding to groups of verbs are prepared. With this method, rough translation is inevitable due to generalization.

Specificity in a structure correlates with the quality of translation. The rule for the three methods is that the more specific the structure, the higher the quality of translation that can be expected. These methods are applied to basic Japanese sentences (objective expressions) in the order described above. If no pattern relevant to a given Japanese sentence can be found in the dictionary of idiomatic expressions or of semantic valency patterns, then a general pattern is used, at the cost of lower translation quality. With the expansion of the pair pattern dictionary, the translation quality is expected to improve.
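The ordered application of the three levels amounts to a most-specific-first fallback, which can be sketched as follows. The rule functions, the romanized example sentences, the verb-type table and the bracketed output format are hypothetical placeholders chosen for illustration; only the ordering (idiomatic, then valency pattern, then general) comes from the description above.

```python
# Level 1 (hypothetical entry): whole idioms transferred as matched pairs.
IDIOMS = {("hara", "ga", "tatsu"): "get angry"}

# Level 3 (hypothetical entries): verbs grouped by type only.
VERB_TYPE = {"aru": "stative", "shinu": "instantaneous"}


def idiom_rule(tokens):
    """Match the complete word sequence; every word is fixed."""
    return IDIOMS.get(tuple(tokens))


def valency_rule(tokens):
    """Match one fixed verb with a fixed joshi frame (toy check only)."""
    if (len(tokens) == 5 and tokens[1] == "ga" and tokens[3] == "o"
            and tokens[4] == "taberu"):
        return f"{tokens[0]} eats {tokens[2]}"
    return None


def general_rule(tokens):
    """No word is fixed; only the verb type selects a rough pattern."""
    if not tokens:
        return None
    verb_type = VERB_TYPE.get(tokens[-1])
    if verb_type is None:
        return None
    return f"<{verb_type} clause: {' '.join(tokens)}>"


# Most specific level first, as in the text.
LEVELS = [
    ("idiomatic", idiom_rule),
    ("valency", valency_rule),
    ("general", general_rule),
]


def transfer(tokens):
    """Return (level name, result) from the first level that matches."""
    for name, rule in LEVELS:
        result = rule(tokens)
        if result is not None:
            return name, result
    return None, None


for sentence in (["hara", "ga", "tatsu"],
                 ["inu", "ga", "pan", "o", "taberu"],
                 ["hon", "ga", "aru"]):
    print(transfer(sentence))
```

Returning the level name alongside the result makes the expected quality of each translation visible, and adding entries to the more specific tables shifts sentences away from the general level without touching the other rules.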

3-3 Structure of the MLMT Method

The MLMT method consists of two sub-methods: a separate and recombine method for subjective expressions, and a multi-level transfer method for objective expressions (Fig. 3).

This translation process is similar to manual translation (Fig. 4). Here, a human translator first experiences for himself the speaker's experience as described by a given sentence. This process is supported by the Japanese norm that connects a speaker's recognition to Japanese expressions. Thus, the translator understands the state of the objects and the speaker's emotions and intentions towards the objects. In the MLMT method, the original Japanese sentence is separated into descriptions of the state of the objects and the speaker's emotions. The state of the objects is represented by a basic Japanese sentence (an objective expression) and the speaker's emotions are arranged in a reference table.

In human translation, the state of the objects is then reorganized in the framework of English, and the speaker's emotions are recombined with it. Similarly, in the MLMT method, the meanings of objects are transferred into English by the three levels of the transfer method. The speaker's emotions, arranged in a reference table, are recombined to give the final English expressions.

In this method, syntactic structure and meaning are represented together by idiomatic patterns or valency patterns. They can be used not only for Japanese-to-English translation, but also for Japanese sentence analysis, which results in fewer ambiguities than the ordinary method. Moreover, transfer rules are highly independent of each other; therefore, the consistency check is limited to a smaller range, facilitating the expansion of the translation system.

4. CONCLUSION

Based on the constructive process theory of natural language, the multi-level machine translation method was proposed.

For machine translation, the importance of separating recognitions concerning subject and object, and of retaining the meaning associated with syntactic structure, was shown. The MLMT method consists of two sub-methods, which correspond to these two ideas: a separate and recombine method for subjective expressions, and a multi-level transfer method for objective expressions.

Ideally, to handle a syntactic structure and its meaning as one unit and thus to produce high quality translation, all possible expressions should be identified and included in the transfer rules. The open-ended characteristics of natural language make this technically impractical. As a technical compromise, expression structures are classified into patterns corresponding to abstraction levels of speaker recognition, and subjective expressions are separated from the original sentences to improve the ratio of matching patterns.

The MLMT method was proposed for translating competent Japanese sentences into English. Since the proposed ideas about separating subjective and objective expressions, and about the importance of the meaning of syntactic structure, can be applied commonly to natural languages, the MLMT method will also operate with other natural languages.

Acknowledgment

The authors wish to thank the other members of our research group for helping to implement this method.

References:

(1) E. F. Koerner: Ferdinand de Saussure, Braunschweig: Friedr. Vieweg+Sohn GmbH, 1973 (Japanese edition by K. Yamanaka, Taishukan, 1982)
(2) G. C. Lepschy: A Survey of Structural Linguistics, Faber & Faber, 1970 (Japanese edition by S. Sugata, Taishukan, 1975)
(3) Iwanami: Japanese 6 (Grammar I), 7 (Grammar II), 1977 (In Japanese)
(4) K. O. Apel: Noam Chomskys Sprachtheorie und die Philosophie der Gegenwart, 1971 (Japanese edition by S. Iguchi, Taishukan, 1976)
(5) M. Kazita: The Trace of Transformational Theory (In Japanese), Taishukan, 1976
(6) N. Chomsky: Cartesian Linguistics (Japanese edition by Kawamoto Misuzu, 1966)
(7) N. Chomsky: Language and Mind, New York, 1968
(8) H. Uchida: Japanese to English Translation System ATLAS II, Nikkei Electronics, 17-Dec., 1984
(9) K. Muraki: Japanese to English Translation System PIVOT (In Japanese), Nikkei Electronics, 7-Dec., pp. 195-220, 1984
(10) M. Tokieda: Kokugogaku Genron (Principles of Linguistics) (In Japanese), Iwanami, 1941
(11) T. Miura: The Theory of Noesis and Linguistics (In Japanese), Vol. 1-3, Keiso-Shobo, 1967
(12) Y. Morita: Conception by Japanese (In Japanese), Koki-sha, 1981
(13) T. Miura (ed.): Critique of Modern Linguistics (In Japanese), Keiso-Shobo, 1981
(14) T. Anzai: Conception in English (In Japanese), Kodan-sha, 1983
(15) C. Lancelot and A. Arnauld: Grammaire generale et raisonnee, les fondements de l'art de parler, 1966 (Japanese edition by H. Minamikata, Taishukan, 1972)
(16) C. J. Fillmore: Toward a Modern Theory of Case and Other Articles, Holt, Rinehart & Winston Inc., New York, 1975 (Japanese edition by H. Tanaka and M. Funakoshi, Sanseido, 1975)
(17) T. Ishiwata: Grammar and Meaning (In Japanese), I, Asakura-shoten, 1983

SATORU IKEHARA

Senior Research Engineer, Supervisor in the NTT Communications and Information Processing Laboratories. Since joining the ECL system in 1969, he has developed a formal algebraic manipulation language, queuing network analysis theory, and a natural language processing system. He is presently developing a machine translation system. He received bachelor's, master's and Dr. Eng. degrees from Osaka University in 1967, 1969 and 1983, respectively. He was awarded the dissertation prize in 1982 for his research on queuing network analysis from the Information Processing Society. He is a member of the Institute of Electronics, Information and Communication Engineers of Japan, and the Information Processing Society of Japan.

MASAHIRO MIYAZAKI

Senior Research Engineer in the NTT Communications and Information Processing Laboratories. Since joining the ECL system in 1969, he has developed the computer system DIPS-11, performance evaluation theory for computer systems, and Japanese text-to-speech systems. He is presently developing a machine translation system. He received a bachelor's degree in 1969 and a Dr. Eng. degree from Tokyo Institute of Technology in 1986. He is a member of the Institute of Electronics, Information and Communication Engineers of Japan, and the Information Processing Society of Japan.

SATOSHI SHIRAI

Senior Research Engineer in the NTT Communications and Information Processing Laboratories. Since joining the ECL system in 1980, he has developed Japanese analysis systems for natural language processing systems. He is presently developing a machine translation system. He received bachelor's and master's degrees from Osaka University in 1978 and 1980. He is a member of the Institute of Electronics, Information and Communication Engineers of Japan, and the Information Processing Society of Japan.

AKIO YOKOO

Research Engineer of the NTT Natural Language Processing Laboratory in the NTT Communications and Information Processing Laboratories. Since joining the ECL system in 1982, he has developed a frame representation language and natural language processing systems. He is presently developing a machine translation system. He received a bachelor's degree in 1980 and a master's degree in 1982 from the University of Electro-Communications. He is a member of the Institute of Electronics, Information and Communication Engineers of Japan, the Information Processing Society of Japan, and the Japanese Society for Artificial Intelligence.