|
The Constructive Process Theory and
the Multi-Level Translation Method
|
|
|
NTT Information Network Systems Laboratories
Satoru IKEHARA
|
List of Abbreviations
L :Language
E :English
J :Japanese
JL :Japanese Language
pre-E :pre-editing
J-to-E :Japanese to English
CPT :Constructive Process Theory
NL :natural language
NLP :natural language processing
CLs :computational linguistics
SWD :semantic word dictionary
SSD :semantic structure dictionary
pvt-M :pivot method
trs-M :transfer method
MP :meaning processing
MA :meaning analysis
MU :meaning understanding
SE :subjective expression
OE :objective expression
MT :machine translation
MLT :Multi-Level Translation
SA :semantic attribute
|
|
|
○ Thank you very much, Mr. Chairman, for your kind words of introduction.
○ I am also thankful to Professor Rigsby and the Organizing Committee▼for the
opportunity to visit this beautiful country▽and to get to know my
Australian and other colleagues better.
○ I am particularly honored to be here▽to speak at this Anniversary of your
Society.
● Today I would like to speak about the CPT and the MLT method.
@ I think it would be appropriate to start with some introductory remarks about
myself▼and the research efforts devoted to NLP by our Laboratories.
A Almost 20 years have elapsed▽since I joined the Electrical Communication
Laboratories at NTT.
・ The first 10 years were spent in research work on▼Form. Al.Mani. and traffic theory, with emphasis on QN.
・ It is therefore 10 years▽since I started research work on NLP.
@ NTT currently has 13 laboratories.
A Research on CLs has been conducted for a long time▼at the basic research
laboratories in Musasino city.
B Research on NLP was started▽at the Laboratory in Yokosuka 10 years ago.
・ I was involved in this research effort from the very beginning.
・ From a staff of three in the beginning,▼the number of researchers has grown
to over 30 persons.
C In the area of NLP,▽I have undertaken research efforts in▼
・J text to speech, ・key word extraction,
・error detection and correction of J text,
・and statistical approaches for voice recognition.
・ This research effort continues today.
D I am now engaged mainly in J-to-E MT.
-----------------------------------------------------------------------------
@ While research work on CLs and NLP resemble each other,▼they are not the
same and it would be advisable to handle them separately.
A My current research efforts have been concentrated on NLP and not on CLs.
・ It is not research on how a NL can be described by a theoretical model.
・ It is research on how J that is actually in everyday use can be
translated into understandable E.
・ My research efforts are directed toward realization of this method.
B Upon starting research work on MT,▼a major problem was a considerable gap
between CLs and NLP.
C It can be stated that▽if we apply the results of CLs to actual everyday
words,▼many exceptional cases arise and appropriate processing is
limited.
-------------------------------------------------------------------------------
× It may be that if CLs would achieve further progress and development, these
problems would cease to exist.
× But can we afford to bide our time and wait for such an event before
proceeding any further with our MT system ?
× Our Laboratory is an institution engaged in information processing.
× In conducting MT research, regardless of the method, the final objective is
to make possible the translation of actual words.
× To await the development of CLs would make it difficult to start MT research.
× We would be denied the budget for this purpose.
× It was no easy task for us to obtain approvals for conducting research on MT.
--------------------------------------------------------------------------------
@ NL is something like a custom or habit▼that has spontaneously grown up
within its own group of the L.
・ It reflects the very human society▽in which it was born.
A It is no easy task to explain each individual phenomenon▼without conflict by a unified model.
B If Ls are a social custom, there should be a method of processing that
matches them.
C Research on NLP requires the pursuit▼of every one of these words.
@ I have decided to begin research work▼from the viewpoint of NLP and not CLs.
A Accordingly,▽I have sought a new approach amidst the results of
conventional linguistics,▼promoting a new methodology.
B The background for my research work is▼the CPT of L by Motoki Tokieda,▼who
followed up on the work▽initiated in the 18th century by Norinaga Motoori.
・ Our MT research efforts have been based on this theory▼and have resulted in
a MLT method▼and a prototype J-to-E experimental MT system.
@ Today,▼I would like first to introduce the outline and experimental
results▼of our MT system in the first half of my talk.
A In the latter half,▼I will describe our thinking and methods,▽making
comparisons with the thinking behind CLs.
-------------------------------------------------------------------
Start of main talk
@ Let us start with the relation between our purpose and the current methods
in Japan.
Progress of MT
A As you are aware,▽MT methods have advanced▼and many commercial systems have
appeared recently in Japan.
B Our major concern, at this stage, is J-to-E translation.
Differences between J and E
C JL is very different from European Ls.
・ The quality of J-to-E translation is not good▼compared to French to E or
E to Russian translations.
・ Current methods can be classified as a word-to-word translation,▼or literal translation.
Changes in the Meiji era
E JL itself has been enriched by the importation of European culture in the Meiji era.
・ Therefore, JL has certain expression patterns▼that are easily translatable
into European Ls.
・ These parts can be translated by current methods.
Pre-editing
○ However,▽conventional J expressions are required to be rewritten into easily
translatable J.
D That is, pre-E is inevitable▼in order to apply the current MT system to
practical use.
NTT's purpose
F The purpose of our research is to realize communication services with
translation. ・ We hope to realize direct communications.
・ However,▼current methods are not satisfactory in realizing such services▼
because of the need for pre-E.
G Thus, our aim is to achieve translation without pre-E.
Limitations
H Let's look at the manner in which pre-E is conducted▽to learn about the
limitations of current methods and their causes.
-------------------------------------------------------------------
How to write source sentences
@ How should original sentences be rewritten before translation ?
A Let us look at the examples in this figure.
・ The four essential points of pre-E are shown.
Differentiating word translations
B First, every word should have only one meaning.
・ This example shows the J verb 「kakeru」▼having, in this case, six different
meanings,▽pour, place, build, sit, mop, and play.
・ Such J words need to be rewritten into other words,▼each of which corresponds to one E
word,▼but there are cases which present difficulties in rewriting.
Adjective example
C Second, structures of sentences and expressions should be simple.
・ In this example, the adjective 「utsukushii」 meaning beautiful,▼is placed
in front of the pronoun (I, me or my).
・ This can result in two interpretations,▼"beautiful me" and "beautiful
daughter".
・ To avoid these ambiguities,▼modifiers should be placed immediately in front
of the word to be modified,▼as in this rewritten example.
Ellipsis and supplementation
D Third, supplementation of elliptical subjects and objects will be required.
Idioms and metaphors
E Fourth, idiomatic expressions and metaphors need to be rewritten into other expressions.
Conclusion
F It can be stated that▼these requirements have derived from limitations of
literal translations.
----------------------------------------------------------------
The two methods
@ Let's consider the translation methods.
A Conventional translation methods can be classified into two types, the pvt-M
and the trs-M.
B Among researchers of MT, there have been arguments as to which method is
superior.
Orientation toward the pivot method
○ Among the translation systems developed by private companies, there are many
which seek to achieve the pvt-M.
・ However, there have been none that can be stated to have truly realized the
pvt-M.
・ This is because a universal intermediate L that can be commonly applied to many Ls cannot be designed.
What is L
○ NL has primarily originated by reflecting the perspectives and thinking of
the group of people using such L.
・ It would appear unreasonable to consider an intermediate L that would be
commonly applicable to every L which differs in thinking and perspective.
Pivot
○ When translation between Ls which are very close to each other is considered, or if a rough translation is to suffice, the pvt-M may be satisfactory.
・ But in either case, I think pvt-M has a basic misunderstanding of Ls in its
background.
Transfer
@ The trs-M can be stated as being more realistic compared to the pvt-M,
because it does not assume that intermediate L is universal.
A However, if intermediate L can be thought of as a meaning expression such as deep structure, the same difficulties as in the case of the pvt-M would arise.
B The meanings of surface structure cannot necessarily be retained by deep
structure.
-----------------------------------------------------------------
Human translation
○ Let us consider the manual translation method.
Expression and recognition
@ In the expression that is to be translated, the state of the objects as
recognized through the eyes of the speaker is tied with the expressions.
・ And at the same time, the awareness of the speaker toward the object is also
tied with the expression.
A To tie the speaker's recognition with the expression, a rule of words,
i.e. linguistic norm, is used.
・ This norm differs with every L, such as in E and in J.
Human translation
○ A human translator experiences the speaker's recognition through the
expressions.
・ He is required to know both the state of object and speaker's recognition
thereof.
・ These contents are expressed in the framework of the target L▼on behalf of
the speaker.
Diagram
@ This figure shows▽what is objective in this path and what is subjective
in this path.
A Thus,▼translation is conducted through a re-grasping of the speaker's
recognition.
B A simulation of this process is the MLT method which we proposed.
-------------------------------------------------------------------
Multi-level translation method
@ This figure shows the MLT method.
A The source L is J, and the target L is E.
B The term "Multi-Level" indicates that▼translation is being conducted at
various levels of abstraction.
Explanation of the paths
@ This system is divided up into 2 major conversion paths.
A The one is the conversion of the speaker's sense or what is subjective.
B The other is the path for conversion of the description for the object.
・ And this path splits up into 3 more paths,▼depending on the abstraction
degree in conceptualizing objects.
C Thus, the MLT method has currently four translation paths.
Difference from conventional methods
○ The major difference between this method and the conventional methods is▼
whether the speaker's recognition and the object are regarded separately
or not.
-------------------------------------------------------------------
Knowledge is essential
○ To realize a MT system, knowledge of L is essential.
Battle with ambiguity
○ NLP can be stated as being a battle with ambiguity▼from the beginning to end.
・ In the translation process,▼various types of ambiguities arise,▼but this is
regarded as being a lack of knowledge that is required.
Knowledge
@ To overcome such ambiguity, it is necessary to study what knowledge is
lacking and to have such knowledge established as the rule or as a dictionary.
A It is important to consider the type of ambiguity and relationship▼with
the knowledge corresponding to it.
Policy
○ In our MT system,▼we place importance on knowledge for MP▼for the purpose
of solving ambiguity of sentence structure▼and of the meaning of individual
words.
Our method
○ We have collected 2 major types of information▼and have compiled them into
a form of dictionary▽as shown in the figure.
Conceptualization
@ When the speaker expresses an object, abstraction is conducted and the object is conceptualized.
A Details will be mentioned later, but in L expressions, the concept of object
is expressed separately in terms of a concept of substance and a concept of
attribute.
Compilation into dictionaries
○ Linguistic conventions which combine concepts to expressions have therefore
been compiled into 2 separate dictionaries.
Semantic word dictionary
○ The one is knowledge related to semantic use of words.
・ This has been compiled in the SWD.
Semantic structure dictionary
@ The other is knowledge related to meanings of expression structures.
A The meanings of expressions are not necessarily represented by the sum of the
meaning of every word in the expression.
B In fact, it would appear that the meanings of linguistic expressions actually used cannot be explained solely by the meaning of each word used in the
expression.
C I believe that▼expression structures need to be considered▽as units of
meanings in NL.
D Based on such thinking, the abstraction of expression structures centered
around declinable words and compiled as units of meanings▼has resulted in
the SSD.
E This is one of the important points asserted in our system▼and I shall be
dealing with this factor later in this presentation.
Conclusion
○ I would like to state clearly that▼the knowledge we have discussed here is
knowledge of L▼and not world knowledge such as general common sense
or the knowledge of specialists.
-------------------------------------------------------------------
Summary
○ As discussed in the foregoing,▼we have compiled a SWD and a SSD.
Semantic attribute system
○ The meanings of words and their translation are listed in ordinary
dictionaries.
・ But they do not serve to let the computer know▽how the words are to be used.
・ For example,▼the word "school" is used as a "place"▽and also as an
"organization".
○ As a means of describing this knowledge,▼we have compiled the uses of the
meanings of words as a semantic attribute (SA) system.
Precision of knowledge
@ It is important to consider precision of knowledge description▼in relation
to the degree of ambiguity.
A For example,▽an algorithm which was valid in a world of 1,000 words▼may not
necessarily be valid in a world of 100,000 words.
・ Experiences indicate that▼increase of the number of words causes rapid
growth of ambiguities.
B It is a rule of science that▼quantity gives rise to changes in quality.
C We need to keep this rule in mind, especially in NLP.
Number of words
@ Seeking to establish a practical method, we have collected words including
proper nouns normally used day to day and have compiled a dictionary of
about 400,000 words for our system.
Number of attributes
@ Thus, it is necessary for the SA system to work in this environment.
A We began with about 500 categories.
・ But with this number, translation rules for verbs could not be described
precisely.
B With the consideration of our aim and environment,▼we set up 2,800
categories for general noun SAs▼and 200 categories for proper noun SAs.
Dictionary entries
@ Working with these SAs,▼we have compiled a SWD consisting of 400,000 words▼
and a SSD consisting of 15,000 sentence forms.
A The SSD is in a valency pattern form▼and the index words are verbs and
adjectives.
------------------------------------------------------------------------------
General nouns
@ This figure shows the nodes▽for the top 4 levels▼of the SA system for
general nouns.
・ General nouns are branched off into concrete and abstract,▼with concrete
branching off into subjects, places, and concrete objects.
A There is a total of 2,800 category names.
B Here, I have taken one example at the deepest level.
・ The category tree is 12 levels deep at its deepest point.
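(Note) The branching just described can be pictured as a small tree. The following is only a minimal sketch: it holds the few top categories named above, and the child-to-parent table, the function names, and the choice to count the root as level 1 are assumptions made for illustration, not the actual ALT-J/E data.

```python
# Hypothetical fragment of the SA tree for general nouns (child -> parent).
# Only the branches named in the talk are included.
SA_PARENT = {
    "general noun": None,
    "concrete": "general noun",
    "abstract": "general noun",
    "subject": "concrete",
    "place": "concrete",
    "concrete object": "concrete",
}

def path_to_root(category):
    """Return the categories from `category` up to the root, in order."""
    path = []
    while category is not None:
        path.append(category)
        category = SA_PARENT[category]
    return path

def depth(category):
    """Depth of a category, counting the root as level 1 (an assumption)."""
    return len(path_to_root(category))

print(path_to_root("place"))  # ['place', 'concrete', 'general noun']
print(depth("place"))         # 3
```

With the full system, the same walk would simply continue down to the 12th level for the deepest category.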
----------------------------------------------------------------
Overview
@ Next we have the SA system for proper nouns.
A Attributes for proper nouns are divided into 4 types,▼the names of persons,
of places, of organizations and other.
・ These are sub-divided further.
------------------------------------------------------------------------------
×Names of persons and places
× Names of persons will stop at about the 4th level, but names of places go
further reflecting on the difference in use of names of persons as opposed
to places in J sentences.
× Names of places are used in a "nested" form such as prefecture, county, town and aza (local sub-division), which accounts for the depth of categorization.
-------------------------------------------------------------------------------
Depth of nodes
@ The number of nodes amounts to 200 and the maximum depth is 9 levels.
A An example of the deepest node is shown here.
----------------------------------------------------------------------------
Introduction
@ Now, take a look at how these SAs are located in the SWD.
A This figure shows the contents of one record within the SWD.
Record format
@ The index word is "Tokyo".
A One record consists of 350 Bytes and some 200 to 300 various types of
information have been input into a record.
B SAs of words have been registered in the final 15 fields.
Level of attribute assignment
@ In the SWD, each word has been given a lower level of attribute.
A Conversely, with the SSD, higher level attributes have been used.
B This is due to consideration for retaining the relationship between higher
and lower level attributes.
・ Expression structures should be defined by generalized format as far as
possible.
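(Note) One way to picture why words carry lower-level attributes while expression structures use higher-level ones: a structure slot that asks for a high-level attribute accepts any word whose own attribute lies beneath it in the tree. This is a minimal sketch only. The categories "facility" and "public park", the word "kouen", and the function names are invented for illustration and are not taken from the actual dictionaries.

```python
# Hypothetical fragment of the SA tree (child -> parent).
# "facility" and "public park" are invented lower-level categories.
PARENT = {
    "general noun": None,
    "concrete": "general noun",
    "place": "concrete",
    "facility": "place",
    "public park": "facility",
}

# Hypothetical SWD entry: the word is registered with a lower-level attribute.
SWD = {"kouen": ["public park"]}

def is_a(attribute, ancestor):
    """True if `attribute` equals `ancestor` or lies below it in the tree."""
    while attribute is not None:
        if attribute == ancestor:
            return True
        attribute = PARENT.get(attribute)
    return False

def slot_accepts(word, required_attribute):
    """True if any attribute of `word` satisfies a higher-level requirement."""
    return any(is_a(a, required_attribute) for a in SWD.get(word, []))

print(slot_accepts("kouen", "place"))     # True
print(slot_accepts("kouen", "abstract"))  # False
```

Keeping the general requirement on the structure side means one pattern covers every word below that node, which is the generalization the talk asks for.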
------------------------------------------------------------------
Comparison
○ At this point,▼we would like to see the differences in description
capabilities▼between our SA system and other similar systems.
Case 1
○ Case 1 is a typical example of a J-to-E MT system which has recently been
commercialized.
・ The number of semantic categories is between 30 and 50.
Case 2
○ Case 2 is an example of the dictionary of word concept▼being planned by the
Japan Electronic Dictionary Research Institute (EDR)▼established by MITI
(Ministry of International Trade and Industry) of Japan.
Case 3
○ Case 3 is an example of our translation system ALT-J/E.
Comparison method
@ A comparison was made of the three cases.
・ And, the capabilities of describing the translation rules for verbs are
evaluated.
A In the comparison, the case of ALT-J/E has been assumed to be 100%.
B The dictionary in the case of EDR is just in the planning stage so that the
values for Case 2 have been obtained from experiments conducted in our system based on similar conditions.
Results
○ The results of this comparison are as follows.
○ The description capability was 31% for Case 1 and 59% for Case 2.
@ Those results indicate that▼when precision levels of SAs are low, essential
rules cannot be written.
A Our experience also indicates that,▼in J-to-E MT,▼a precision level of some
3,000 types of SAs is necessary▼for differentiation of translation for verbs.
B However, this is not sufficient for translation of nouns.
・ I think that some other different system from the SA is required for
differentiation of translation in the case of nouns.
------------------------------------------------------------------
Introduction
○ Let me show▽what kind of translation has become possible by our system.
Example of 「kakeru」
@ The first example is differentiation in the translation of the J verb 「kakeru」.
・ This is the same example as shown in the beginning of Pre-E.
A The verbs in the J sentences are all the same 「kakeru」.
・ But, the translation by ALT-J/E shows▼poured, made, caused and other
different words.
B J verbs in general▽need to be translated into many different E words.
C In the case of 「kakeru」,▼there are about 15 different ways of translation
for general use▼and to include idiomatic usage,▽there will be a need to
consider some 80 types of translations.
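(Note) The idea of choosing the E verb from the SA of the accompanying noun can be sketched as follows. This is an illustration under assumptions, not the ALT-J/E rule set: the E verbs are among those mentioned earlier in the talk, but the attribute names, the romanized nouns, and the table layout are all invented here.

```python
# Hypothetical valency-style patterns for the J verb "kakeru".
# Each entry: (particle, required SA of the noun, E verb).
# The attribute names are invented for illustration.
KAKERU_PATTERNS = [
    ("o", "liquid", "pour"),
    ("o", "structure", "build"),
    ("ni", "seat", "sit"),
    ("o", "recording", "play"),
]

# Hypothetical semantic word dictionary: noun -> SA.
NOUN_SA = {
    "mizu": "liquid",       # water
    "hashi": "structure",   # bridge
    "isu": "seat",          # chair
    "rekodo": "recording",  # record
}

def translate_kakeru(noun, particle):
    """Return the E verb of the first matching pattern, or None."""
    sa = NOUN_SA.get(noun)
    for pattern_particle, required_sa, english in KAKERU_PATTERNS:
        if pattern_particle == particle and sa == required_sa:
            return english
    return None  # no pattern matched

print(translate_kakeru("mizu", "o"))  # pour
print(translate_kakeru("isu", "ni"))  # sit
print(translate_kakeru("isu", "o"))   # None
```

With only a few dozen categories, many of these rows could not be told apart, which is the point made by the comparison of the three cases.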
「suru」
○ The verb with the largest number of variations would be 「suru」▼with over 300
different ways of use.
・ This verb resembles the E verb "do".
Example of 「mure」
@ The second example shows differentiation in the translation of the noun 「mure」.
A The same word 「mure」 is used 8 times in the J text▼but the translations are
different for each case.
B In this case,▼the E word varies depending on what is referred to.
------------------------------------------------------------------------------
× According to a native E speaker who was interviewed, the word "bevy" here is used only in the case of beautiful women.
× It was also pointed out that "mob" sounds strange.
× It would appear that we need to change our rules somewhat.
× Here, it would suffice if our audience can appreciate that differentiation
needed in the translation of nouns has also progressed to certain level.
---------------------------------------------------------------------------
Examples
○ The next is an example of an automatic rewriting function of J sentences.
First example: automatic rewriting
@ Here is a proper J sentence.
A The literal translation of this sentence is = (read the E)
・ I think this E may be understood,▽but it does not sound like proper E.
B The reason for this can be seen clearly:▼it has an excessive number of verbs.
Differences between J and E
○ J is a L of "circumstantial logic",▼and it necessarily uses many verbs.
・ In contrast, E is a L of "substantial logic"▼and is said to prefer
structural expressions.
Reduction of verbs
@ Thus, to reduce the verbs,▼a partial rewriting of the J text in the figure
is undertaken.
A This rewriting is performed by the translation system automatically.
B After automatic partial rewriting,▼we get the translation as = (read the E)
・ I think the quality of E has improved.
--------------------------------------------------------------------------------
×Original goal
×@ It goes without saying that it is our goal to think through J texts that
are difficult to translate and to translate these into E that is readily
understandable.
×A Yet another alternative is to automatically change J text that is difficult
to translate into text that is easily translatable.
×B This method is particularly convenient when a translation function that is
already completed can be applied without having to establish the function
anew.
------------------------------------------------------------------------------
==Continued==
Second example: supplementation
○ The next example is supplementation of subjects and objects.
Conventions of J
@ The J convention is to refrain from mentioning▼what is already known to the
listener.
A Lengthy, redundant expressions are not popular.
B Particularly, the subject and object are omitted unless absolutely necessary.
・ This is because▽these can generally be inferred by the listener▼from joshi (particles) and auxiliary
verb expressions.
Conventions of E
@ In contrast,▼in E,▽both subject and object are generally necessary.
A There are cases▼where the subject is missing, so the object is turned into
the subject▼and the sentence is rewritten in the passive voice.
B But this is not always appropriate.
Example sentences for supplementation
@ Here are two sentences which are linked to one another. (read the J)
A The first sentence has the subject and object,▼but the second sentence has
two subjects omitted.
B Therefore,▼ALT-J/E has supplemented the first subject by the object of the
first sentence.
C And the latter subject is supplied by the subject of the first sentence.
Summary
○ I have shown some of the new functions that have been realized by ALT-J/E.
Conclusion
○ I believe that,▼with the realization of these new functions,▼
MT without pre-E▼is now a real possibility.
−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−
−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−
@ The foregoing is the initial portion of my presentation.
・ I have introduced the MLT method and our MT system ALT-J/E.
A In the latter half of my presentation,▼I would like to mention details of
our conceptual background.
B If there should be any questions on the discussions up to this point, I shall be happy to answer them.
----------------------------------------------------------------------------
@ I shall begin the latter portion of my presentation.
A I have already mentioned that▼our research efforts are being conducted from
the viewpoint of NLP, not CLs.
B Let me discuss the thinking behind our MT system.
Background
@ The background to our research lies in the CPT of L.
A This theory was advocated some 50 years ago▼by a J linguist, Motoki Tokieda.
B Tokieda built on the J grammar▼which was established some 200 years ago by Norinaga Motoori,▼and introduced the CPT from the standpoint of▼criticizing
the philology of Ferdinand de Saussure.
Order of the talk
@ First, I wish to present the differences▼between the concepts of CLs and of
CPT.
A I shall follow by discussing▽how the CPT is related to our MLT method.
-------------------------------------------------------------------------------
Start of main talk
@ Let's consider the background of CLs first.
A In considering logic,▽there have been two types of logic since olden
times.
Dialectical logic
@ The one is dialectics.
・ Dialectics was advanced by Hegel.
A This type of logic asserts that▼actual existence should be thought of as a
meaning theory.
・ Hence, it deals with contents.
Formal logic
@ The other type is formal logic.
・ This type of logic appeared in the era of ancient Greece.
A This logic insisted on purity and on handling an ideal world.
・ And it cut itself off▽from the realistic world.
Symbolic / computational logic
○ Symbolic logic and computational logic are derived from the latter, formal
logic.
Natural language
@ There have been efforts to apply CLs to NL.
A But NL is a L▽that deals with the realistic world.
B This would seem to be the cause of various problems.
・ For example,▼Frege's principle of compositionality, which
constitutes a major assumption in CLs,▼is not valid in a NL.
----------------------------------------------------------------------------
-------------------------------------------------------------------Fig.15
@ Let us look at the differences between the J and Indo-European Ls.
Western Ls
A The Indo-European Ls have sentence patterns, such as the five basic sentence patterns,▼which are
relatively independent from contents.
・ These Ls can be classified as a suitcase type L.
・ Structuralism was developed▼from the viewpoint of the importance of forms and
structures of expressions.
・ Chomsky took notice of the importance of contents to overcome the limitation of structuralism.
Japanese
B In contrast,▼JL is relatively free from forms▽and does not have a rigid
sentence structure.
・ J can be classified as a 「furoshiki」(wrapping cloth) type L.
・ Forms are dependent on contents.
・ In other words, the shape of contents appears in a form of expression.
C Thus, contents are important in J processing.
Japanese grammar
D In J, there are 20 different types of grammars.
・ And four of them are well known.
○ Here, I would like to take up two▽out of the four.
・ One is Hashimoto grammar.
・ This grammar is strongly influenced by European philology.
・ And this grammar was adopted for use in school▽by the Ministry of Education
of Japan.
・ Another grammar is Tokieda grammar▽which was derived from Norinaga Motoori.
The idea behind Tokieda grammar
E As nature is a complex body of processes▼that develops with conflicts as
the motivating force,▼so also is L.
F Tokieda established the CPT of L▽based on this idea.
----------------------------------------------------------------- Fig. 16
@ This figure shows the differences▼between the CPT and Chomsky's concept.
Constructive Process Theory
A CPT asserts that▼a L should be considered as a complex body of processes▼
comprised of objects, speaker recognition and expressions.
B Objects are reflected in recognition.
・ This process is explained by the reflection theory.
C Recognitions are combined into expressions using norms,▽namely the rules of Ls.
・ Linguistic norms are national rules▼which have been fostered in each community.
・ These norms are referred to as a grammar▼in the broad meaning of the word.
Chomsky
D In contrast,▼Chomsky interpreted L in the form of a dual structure,▼a
surface structure and a deep structure.
E The existence of deep structure has been assumed▼as the meanings of surface structure.
・ This is certainly a strange theory,▽because meanings are assumed for
the purpose of explaining meanings.
・ It would further appear that there would be difficulty in explaining the
existence of deep structure.
Recent Chomsky
・ More recently,▼he appears to be attempting explanation based on mystic
concepts.
・ He says▼L cannot be explained without assuming that▼humans are born with a
basic linguistic ability.
Law of causality
F Well, results are achieved from causes▼and these results, in turn, become
the causes for the next series of results.
Nature and language
・ Nature is a composition of such processes▼and it is the conflicts and
discrepancies which activate these processes.
・Conflict is energy. === ・I believe this is the thinking followed by science.
Intellectual product
・Language is a spiritual by-product of humans.
・There is room for various theories to arise▼based on how the human spirit is to be considered.
----------------------------------------------------------------- Fig. 17
○ Let us now see▽what happens▼when we regard L as a causal sequence▽between
object, recognition and expression.
The world of objects
○ The world of objects consists of a speaker and other objects.
・ Objects are composed of substances, attributes and relations.
B The speaker recognizes the object world▼and at the same time, recognizes
things about himself.
・ In recognition of objects,▽conceptualization takes place.
Self-disunion
○ There are two occasions regarding recognition of self.
・ One is the occasion where▼speaker represents himself directly in the
expression without conceptualizing himself.
・ On the other occasion,▼he conceptualizes himself and represents himself as an object.
○ When he conceptualizes himself,▼the phenomenon of self-disunion takes place.
・ A separate imaginary self other than the actual self appears.
・ And this imaginary self conceptualizes the actual self.
Subjective | objective
○ Recognition related to subjects and objects is represented▼by two types of
forms in linguistic expression.
・ One is a SE and the other is an OE.
C SEs represent speaker emotions and intentions.
・ These expressions are expressed in the JL▼by post positionals and adverbs.
D OEs represent conceptualized objects.
・ And these expressions are generally expressed by nouns and verbs.
The case of E
E This relation is somewhat different in E.
・ Inflections usually express SEs in E.
------------------------------------------------------------------ Fig.18
@ Let us take a look at this example. == read the J-E word pairs ==
A This sentence means▼"He will also want to go to America".
B "He, America, go" are OEs.
C "also, to, want, will" are SEs.
Port-Royal
・ A similar view is witnessed in the grammar of Port Royal in France▽300
years ago.
D The difference is an important factor in the translation method.
・ I plan to mention this later in relation to the MLT method.
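(Note) The split of the example sentence into OEs and SEs can be written out as a small sketch. It works on the E gloss words rather than on the J words themselves, which is a simplification; the two word sets are exactly those listed in the talk, while the function name is invented for illustration.

```python
# Word sets as listed in the talk for the example sentence.
OE_WORDS = {"he", "america", "go"}          # objective expressions
SE_WORDS = {"also", "to", "want", "will"}   # subjective expressions

def split_expressions(words):
    """Partition gloss words into (OEs, SEs), keeping sentence order."""
    oes = [w for w in words if w.lower() in OE_WORDS]
    ses = [w for w in words if w.lower() in SE_WORDS]
    return oes, ses

gloss = "He will also want to go to America".split()
oes, ses = split_expressions(gloss)
print(oes)  # ['He', 'go', 'America']
print(ses)  # ['will', 'also', 'want', 'to', 'to']
```

In the MLT method the two lists would then go down different conversion paths, one for the description of the object and one for the speaker's sense.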
------------------------------------------------------------------
● Here, let us proceed to the problem of meanings.
@ Recently, the importance of semantic processing in NLP is▼being strongly
emphasized.
・ Much effort has been devoted to semantic or meaning processing,▼but it is
felt that,▽in many cases, the definition of meaning is ambiguous.
○ I would like to point out here what we regard as meaning.
Classification of semantic theories
A Let us consider the structure of Ls.
・ If we assume that meaning is a substance,▼it must be one of "object",
"recognition" or "expression",▼or the fourth element, "interpretation".
Form and meaning
・ If we say, at this point, that▼meaning is born out of forms,▼we would
arrive at a "formal semantic theory".
・ But this would be ideological.
B Next, if we say that forms are born out of contents,▼this would result in an "object semantic theory" or a "recognition semantic theory".
Object and meaning
C If we assume that objects and recognitions are meanings,▼how should we
explain a sentence that has been erroneously written?
・ We would have a sentence that is incorrect,▽in which the opposite of what was intended is written,▼and yet we would have to say that its meaning is correct.
D Further,▼the object and recognition can never remain the same indefinitely
and will undergo changes.
・ Then, the meaning of the sentence will change independently from expressions.
Tokieda's theory
E Motoki Tokieda chose eclecticism and took meanings to be listener's reactions.
Miura's theory
・ Tsutomu Miura built on the Tokieda grammar,▼but improved the theory and
advocated the "relation meaning theory".
F He proposed that▼structural processes from object to recognition are
combined to make an expression.
・ He explained the meanings of linguistic expression as the relationship▼
between object, recognition and expression.
G This kind of relationship is concrete and specific.
・ As long as the expression exists,▼so will the meanings.
・ When the expression is erased,▽so are all the relationships,▼and the
meanings are also lost.
Situation semantics
H The concept of regarding relationships as meaning▼resembles recent situation semantics,▼but actually, the two are entirely different.
I In situation semantics,▽the meanings of expressions▼and the meanings
related to the location in which the expression is placed▼are completely mixed up.
・ The Miura grammar features a clear distinction▼between linguistic expression and expression of locations.
----------------------------------------------------------------- Fig. 20
○ How do we go about MP▼from the viewpoint of the "relation meaning theory"?
○ Meaning is the relationship between object, recognition and expression.
Summary of meaning
○ SE involves the emotions and intentions of the speaker,▼whereas OE expresses the status of the object▼as viewed from the eyes of the speaker.
・ Thus, MU involves retracing this relationship▼and reliving
the status of the object and the recognition.
Two kinds of meaning processing
C Here, let us think of MP in two steps.
Meaning analysis
D The first step is the process▼which identifies the rules or
conventions used in the linguistic expression.
E Conventions regarding Ls are complex and liable to be construed in many ways.
・ The listener must identify the convention used by the speaker.
・ I would like to define this act of identifying the convention as "MA".
Meaning understanding
F The second step would be the process of identifying the recognition of the
speaker▼and the status of the object that is tied in with the expression.
・ Here, we would like to refer to this action as "MU".
Summary
G Therefore, MP consists of two processes,▼"MA" and "MU".
Knowledge
H The knowledge that is required for MA is linguistic knowledge,▼that is, conventions or norms regarding Ls.
・ In contrast,▼MU requires knowing the status of objects,▼and this
requires general knowledge, that is, world knowledge and knowledge in
specialized fields.
I Let us go over the next example.
----------------------------------------------------------------- Fig. 21
@ This diagram is an example of MA. =Read the Japanese and the English=
Upper sentence
○ The meaning of the sentence is ------------.
○ Let us think about the convention involved in these words.
高い (takai)
○ The word "takai" can mean expensive, high, noble or loud, among others,▼but here it is used with the meaning "expensive".
油 (abura)
○ "Abura" has various conventions as shown in the figure,▼but here the
convention expressing the meaning of oil is used.
What linguistic analysis is
○ Thus,▼MA is identification of the convention▼which the speaker actually
used among numerous conventions▽that exist in the L.
・ MA cannot be achieved▼by merely staring at the words one by one.
Lower sentence
@ Let us look at the lower sentence. ==Read the Japanese==
A Look at this portion of the sentence: 背の高い, which literally means "the back is high".
・ In E, the single word "tall" is used to convey this meaning.
・ But looking at "back" and "high" separately▼will not yield the concept of "tall".
B The same can be said of the other expression.
・ "Sell" and "oil" together mean "idle away one's time" in this case.
C Thus,▼conventions regarding words can never be determined▼if they are
considered word by word.
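The point above can be sketched in a few lines of Python. This is only a toy illustration, not the ALT-J/E implementation: the two tables, their entries (including the extra senses listed for "abura") and the function name `analyze` are all assumptions made for the example. It shows that a single word keeps several candidate conventions, while a whole structure such as "se no takai" selects one convention at once.

```python
# Toy sketch of meaning analysis (MA) as convention identification.
# All entries and names below are hypothetical illustrations.

# Conventions attached to single words: each word has several candidates.
WORD_CONVENTIONS = {
    "takai": ["expensive", "high", "noble", "loud"],
    "abura": ["oil", "fat", "grease"],
    "se": ["back"],
    "uru": ["sell"],
}

# Conventions attached to whole structures rather than to single words.
PHRASE_CONVENTIONS = {
    ("se", "no", "takai"): "tall",
    ("abura", "wo", "uru"): "idle away one's time",
}


def analyze(tokens):
    """Return (span, candidate meanings) pairs, preferring the longest phrase."""
    result = []
    i = 0
    while i < len(tokens):
        # Try the longest multi-word span first, down to two words.
        for length in range(len(tokens) - i, 1, -1):
            span = tuple(tokens[i:i + length])
            if span in PHRASE_CONVENTIONS:
                result.append((span, [PHRASE_CONVENTIONS[span]]))
                i += length
                break
        else:
            # No phrase matched: fall back to the single word's candidates.
            word = tokens[i]
            result.append(((word,), WORD_CONVENTIONS.get(word, [])))
            i += 1
    return result
```

Calling `analyze(["takai"])` leaves four candidates open, whereas `analyze(["se", "no", "takai"])` returns the single convention "tall". Choosing among the remaining candidates for an isolated word would need further context, which this sketch does not attempt.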
Summary
A I said that▼the meaning of an expression is the relationship between object, recognition and expression.
B Structures of objects are reflected in the speaker's recognition.
・ These structures are also carried over into expressions.
Structure and meaning
@ This means that▼the structure of an expression is a part of the meaning.
・ Structures cannot be separated from meanings because structures also
represent meanings.
A Thus, in MA also,▼there is a need to consider the relationship between
structure and meanings.
・ More specifically,▼it is important to grasp the structure of expressions as
units of meaning.
------------------------------------------------------------------ Fig.22
○ Next, let us consider MU.
@ As discussed in the foregoing,▼MU involves the identification of the
speaker's recognition▼and the status of the object as tied in with the
expression.
Understanding
A To know the speaker's recognition and the status of the object,▼the listener
must be equipped with a certain world in his mind.
・ This world must have certain elements in common with the world▼which the
speaker is depicting.
・ To reconstruct a world▽corresponding to the expressions of the speaker,▼he
must link the elements of the speaker's expressions▼with
the elements of his own world.
B In other words,▼MU can be considered as linking the elements of the
linguistic expressions▼with elements within the world model of the listener.
・ This action of linkage will structure a new portion in the listener's world.
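As a rough illustration of this linking, the following Python sketch treats the listener's world model as a plain dictionary. The representation, the function name `understand` and the sample entries are assumptions made for this example only; they are not taken from our system. Elements already present in the world are linked and enriched, and elements that are not present become a new portion of the world.

```python
# Minimal sketch of meaning understanding (MU) as linking expression
# elements to a listener's world model. The data layout is hypothetical.

def understand(elements, world):
    """Link each (name, attributes) element to the world model.

    elements: list of (name, dict of attributes) taken from the expression
    world:    dict mapping names to attribute dicts (modified in place)
    Returns two lists: names that were linked, and names newly added.
    """
    linked, added = [], []
    for name, attrs in elements:
        if name in world:
            # Common element: link it and attach what the expression says.
            world[name].update(attrs)
            linked.append(name)
        else:
            # Unknown element: the linkage structures a new portion.
            world[name] = dict(attrs)
            added.append(name)
    return linked, added
```

For example, starting from a world that knows only "abura", an expression mentioning "abura" and "mise" links the first and adds the second.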
Knowledge
C In contrast to linguistic knowledge for MA,▼MU will require world
knowledge.
D Compared to linguistic knowledge,▽world knowledge is massive,▼and
research on it involves considerable difficulty.
・ We are therefore conducting our research effort based on the following
concepts.
----------------------------------------------------------------- Fig. 23
@ We shall discuss MP in MT by dividing it into MA and MU.
Meaning analysis and MT
A The relationship between our MT research and MP is as follows.
Knowledge
○ In MT, the contents of the source texts to be translated are diverse,▼and it
would be difficult indeed to prepare world knowledge covering them.
MT policy
D MT can be acceptable▼even if the computer cannot entirely understand the
contents of the text▼as long as the translation results can be finally
understood by humans.
E We therefore replace the linguistic conventions used in the JL▽with E conventions▼and
leave the rest up to the judgment of the average person.
・ This is the basic stance assumed by our research efforts in MT.
Steps
○ First,▼we want to realize a translation based on MA.
○ Next,▼we will extract the expressions and concepts▽which could not be
translated by MA▽and attempt MU with a limited world model.
As for meaning understanding
○ In contrast,▼in areas such as telephone number information and database
inquiries,▼the target world can be restricted to a comparatively small
range.
・ We are structuring world models and conducting research in MU for such fields.
----------------------------------------------------------------- Fig. 24
Structure of MLT
@ The method of translation based on the foregoing MA▼is the MLT method.
A I have previously mentioned the need to observe differences▼between SE and
OE.
B And, I have also pointed out the need to consider▼the
meaning of structures.
C The abovementioned translation method has been structured based on these
two factors.
First point
D The first point is considered as follows.
・ J and E are very different in how they present SEs.
・ Therefore,▽SEs are segregated from OEs▼and converted into E.
Second point
E The second point is resolved by abstraction of expression structures.
・ Three levels of abstraction have been proposed.
・ The first is the level of idiomatic expressions.
・ The second concerns specific structures such as loosely coupled words.
・ The last concerns the most loosely coupled structures,▼which can be represented by
general rules.
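The three levels can be pictured as a cascade that tries the most specific knowledge first. The Python sketch below is an assumption-laden toy, not the actual MLT implementation: the tables `IDIOMS`, `PATTERNS` and `WORD_GLOSS`, the slot notation with upper-case names, and the word-by-word fallback standing in for "general rules" are all invented for illustration.

```python
# Toy cascade over three levels of abstraction. All data is hypothetical.

IDIOMS = {("abura", "wo", "uru"): "idle away one's time"}   # level 1
PATTERNS = [(("X", "wo", "uru"), "sell {X}")]                # level 2
WORD_GLOSS = {"abura": "oil", "hon": "book", "uru": "sell", "takai": "expensive"}


def match(pattern, tokens):
    """Return slot bindings if tokens fit the pattern, otherwise None.

    Upper-case pattern items are slots; other items must match literally.
    """
    if len(pattern) != len(tokens):
        return None
    bindings = {}
    for p, t in zip(pattern, tokens):
        if p.isupper():
            bindings[p] = t
        elif p != t:
            return None
    return bindings


def translate(tokens):
    """Return (level used, English output) for a token sequence."""
    key = tuple(tokens)
    if key in IDIOMS:                          # level 1: idiomatic expression
        return "idiom", IDIOMS[key]
    for pattern, template in PATTERNS:         # level 2: specific structure
        bindings = match(pattern, tokens)
        if bindings is not None:
            filled = {s: WORD_GLOSS.get(w, w) for s, w in bindings.items()}
            return "pattern", template.format(**filled)
    # level 3: stand-in for general rules, here a plain word-by-word gloss
    return "general", " ".join(WORD_GLOSS.get(w, w) for w in tokens)
```

With these tables, "abura wo uru" is caught at the idiom level, "hon wo uru" at the pattern level, and "takai hon" falls through to the general level. The ordering matters: if the pattern level were tried first, the idiom would be rendered literally.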
Handling of subject/object
F From the foregoing,▽the MLT method consists of two major structures.
○ From the viewpoint of linguistic knowledge,▼linguistic conventions are
divided into those▼which are related to SEs and those related to OEs.
G Of the two,▼the knowledge related to SEs is smaller in scale than that related to
OEs▼and has been handled in the form of rules or programs.
・ But the knowledge related to OEs being massive,▼we have decided to compile it in the form of a
dictionary.
H We shall next explain the method of summarizing linguistic knowledge▼related to objective expressions.
---------------------------------------------------------------- Fig. 25
Structure of the world
@ I have already explained that▼the target world of linguistic expression is
structured by the object,▼consisting of substances, relations and attributes,▼and by the subject, which is the speaker himself.
・ Here,▼the objectified speaker himself is identical with substances.
・ Thus,▼knowledge for OE can be considered as consisting of three types,▼
substances, relations and attributes.
Nouns
A In terms of linguistic expression,▼substances and relations can be expressed by nouns.
・ Thus,▼the knowledge concerning the use of nouns has been summarized in an SWD.
Attributes
B Regarding attributes,▼there are a dynamic type and a static type,▼which are
represented by verbs and adjectives, respectively.
・ Thus, these have been summarized in an SSD.
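One way to picture how the two dictionaries might work together is sketched below in Python. The category labels, the entries and the names `SWD`, `SSD`, `Frame` and `select_english` are hypothetical choices for this example, and the actual dictionary formats may differ. The idea shown is that the SWD gives a noun a semantic category, and the SSD entry for a verb or adjective states which category it expects, so that the pair selects one English rendering.

```python
# Hypothetical sketch of a semantic word dictionary (SWD) for nouns and a
# semantic structure dictionary (SSD) for predicates. Entries are invented.
from dataclasses import dataclass

# SWD: noun -> semantic category (labels are illustrative only).
SWD = {"nedan": "price", "yama": "terrain", "koe": "sound"}


@dataclass(frozen=True)
class Frame:
    predicate: str          # J verb or adjective
    subject_category: str   # semantic category required of the subject noun
    english: str            # E rendering chosen when the frame applies


# SSD: one frame per convention of the predicate.
SSD = [
    Frame("takai", "price", "expensive"),
    Frame("takai", "terrain", "high"),
    Frame("takai", "sound", "loud"),
]


def select_english(predicate, subject):
    """Pick the E rendering whose frame fits the subject's SWD category."""
    category = SWD.get(subject)
    for frame in SSD:
        if frame.predicate == predicate and frame.subject_category == category:
            return frame.english
    return None  # no convention identified with this knowledge
```

Here `select_english("takai", "nedan")` yields "expensive" while `select_english("takai", "yama")` yields "high", and a noun missing from the SWD yields no result.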
Summary
C As mentioned in the earlier portion of my presentation,▼linguistic knowledge has been summarized based on these ideas.
×Difference from a thesaurus
× An item similar to the SWD is the thesaurus.
× In E, the thesaurus by Roget is famous, and there are several examples in
Japan.
× These perceive the similarity between the meanings of individual words
and classify them accordingly.
× Our semantic word dictionary instead takes the viewpoint from which a word is
used and thus adds a semantic role.
----------------------------------------------------------------- Fig. 26
@ I have presented an outline of the J to E MT system, ALT-J/E▼and the related conceptual background.
Current status
A The current state of NLP research at our research laboratory▼can be summed
up as follows.
Morphological analysis
B First, there is morphological analysis,▼which constitutes a considerable stumbling
block in the JL.
・ I think its complexity is no less than that of syntax analysis or meaning
analysis.
C The technology involved is almost complete.
・ A remarkably high level of precision and accuracy has been achieved.
Syntax analysis
・ Syntax analysis and MA are being undertaken in the course of MT research.
・ High levels have been achieved,▼but I do not believe▼they are yet
satisfactory for application to newspaper article translation.
Meaning analysis
C Research on MU has been initiated in question answering.
・ This will require more time and effort.
---------------------------------------------------------------- Fig. 27
@ The main issues confronted by MT research▼and future research targets are as presented in the figure.
Current issues
A Current subjects are ....==Read No. 1==
・ Our major research efforts are being concentrated in these areas.
Future issues
B Our future study efforts will be concentrated on ....==Read No. 2==
・ Preparations for these studies are now under way.
−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−
○ Today,▼I have described the thinking and method behind our MT system.
○ I am hoping that these ideas will become one of the bases of non-literal
translation.
・ The problem of non-literal translation has begun to attract attention▼as
witnessed by the fact that▼it was taken up at the IJCAI Workshop in Sydney
last August.
○ If sharing our experiences in dealing with this system▼would be of interest
to you,▼it would be our pleasure to do so.
○ This concludes my presentation,▽and once again,▼my heartfelt appreciation
for this opportunity.
・ I thank you.
--------------------------------------------------------------------------------
Comments
inscription → description ??
Difference between consider, conceive, perceive
Fig. 20 starts here (for the 40-minute program)
● Here, let us proceed to the problem of meanings.
・ I can't discuss details of meaning theory here.
・ However, we have adopted Miura's theory, which asserts that meanings are the
relationship between object, recognition and expression.
・ This theory is called "relation meaning theory".
H The concept of regarding relationships as meaning▼resembles recent
situation semantics,▼but actually, the two are entirely different.
I In situation semantics,▽the meanings of expressions▼and the meanings
related to the location in which the expression is placed▼are completely
mixed up.
・ The Miura grammar features a clear distinction▼between linguistic
expression and expression of locations.
Replacement of one sentence in Fig. 24
I have not discussed the details of the structural meaning of expressions.
However, I must point out here the need to consider the
meanings of structures as the second point of our translation method.